Louis

Joan Patterson, Washington University

Loren D. Reid, University of Missouri

Forrest H. Rose, Southeast Missouri State T. C.
Ray E. Keesey, Ohio State University

V. A. Ketcham, Ohio State University

Vincent S. John Carroll University

Franklin H. Knower, Ohio State University

Elizabeth Koops, Ohio University

A. C. LaFollette, Ohio University

Robert A. Lang, Western Reserve University

Charles Layton, Muskingum College

James I. Lore, Jr., Cleveland Hearing & Speech
Center

Ruth Lundin, Western Reserve University

Marie K. Mason, Ohio State University

Adeline E. McClelland, Bowling Green St. Univ.

Marie E. Mill, Lash High School, Zanesville

A. Elizabeth Miller, Youngstown Public Schools

Catherine Morris, Ohio State University

D. W. Morris, Ohio State University

H. W. Nisonger, Ohio State University


Harold B. Obee, Bowling Green State University

Robert I. Pearce, Kent State University

Mary Quirk, Dayton Public Schools

Robert Gould Rittenour, Kingston

Galen Starr Ross, Capitol College of Oratory
J. Ohio State University

Virginia S. Sanderson, Ohio State University
Joseph W. Scott, Ohio State University

C. Seigfred, Ohio University
L. C. Staats, Ohio University

O. W. Stockman, United Brethren Church, New
Lexington

E. Turner Stump, Kent State University

William E. Umbach, Case School of Applied
Science

Frederick G. Walsh, Bowling Green State Univ.

Louis N. Wetzel, Cincinnati Bible Seminary

W. Wiley, Ohio State University

Harry M. Williams, Miami University
Edward Wright, Denison University
STUDIES IN SPEECH INTELLIGIBILITY:
A PROGRAM OF WAR-TIME RESEARCH


EDITED BY JOHN W. BLACK¹
Kenyon College 


THE ORIGIN AND NATURE OF THE STUDIES 


The following papers grow out of a
unique program of war-time research
related to speech training.² In 1943, at
the request of the army, investigations 
were undertaken to provide a training 
program for increasing the effectiveness 
of interphone and radio communication 
concerned with aircraft. 

The aim of the training was solely 
functional—to improve voice efficiency, 
for the purpose of getting messages 
through to a listener. This aspect of 
speech was studied under further limit- 
ing circumstances, namely, in the pres- 
ence of high-level noise such as charac- 
terizes the interior of military airplanes, 
and with the voice in use over army in- 
tercommunication and radio sets. Lest 
the point be missed in individual papers 
it is emphasized here that the effective- 
ness of communication equipment is not 
guaranteed by engineering specifications, 
however stringent. The success represented
in the best communication system
is nothing if the operator, that is, the
speaker, fails to handle the equipment
and speak well. Failure or mediocrity
of operating personnel in these respects
accounted for the origin of the project
and all of the papers of this series.


1 Director of the project; with the particular
assistance of Gayland L. Draegert. Individual
papers edited only to minimize repetitions of
descriptions and explanations.

2 “This work was done in whole under Contract
No. OEMsr-830 between The Psychological
Corporation and the Office of Scientific Research
and Development, which assumes no responsibility
for the accuracy of the statements
contained herein.”


The method of the researches was 
controlled experimentation. 


The bulk of the work was done at 
Voice Communication Laboratory, Waco 
Army Air Field, Texas, with service per- 
sonnel as experimental subjects. Some 
of the data were necessarily collected at 
other training centers for purposes of 
comparison. This work continued for 
two years, until mid-1945. 


Positively, the papers present a case 
history of a program of speech training 
for a single circumstance that included 
setting up or establishing in chronologi- 
cal order (1) a satisfactory method of 
measurement, (2) the relevancy of indi- 
vidual contents that might be viewed as 
pertinent to the instruction, and (3) 
curricula that conformed to the require- 
ments of the training situation and em- 
bodied the contents that had been 
proved to be of value. The results clearly
show the advantages of applying experimental
method to training programs
that have limited or specific goals. The 
work also raises questions about the 
advisability of developing specific train- 
ing techniques to accompany industrial 
radio communications, rapidly growing 
in general use. 

Negatively, the papers contain little 
of immediate application to general 
speech training. Aesthetics of voice are 
ignored. Even the main topic, intelli- 
gibility, as treated here cannot be safely 
generalized so as to apply to under- 
standability in all telephony, much less 
public address. 

In presenting this summary of the re- 
sults of the program, a careful selection 
of topics has occurred, perhaps more 
than is evident. The details of experi- 
mental procedures are frequently omit- 
ted, experiment by experiment, but ap- 
pear in the series so that the reader 
can transpose them and recreate the cir- 
cumstance of any single experiment with 
fidelity. The purpose is to give an ac- 
count that exemplifies the main lines of 
the researches. Special researches, for ex- 
ample the treatment of communication 
in simulated high-altitude conditions, 
are omitted, not because of unimport- 
ance but because in experimental tech- 
nique they were but adaptations of the 
methods used for studying the general
problem of intelligibility. Similarly, a 
large amount of field service work and 
operational research related to the train- 
ing units was highly important in the 
activity of the project but comes into the 
present reports only by implication. The
reports fail to give an accurate impres- 
sion of the scope of the training and re- 
search programs either geographically 
or numerically. Such omissions are de- 
liberate. These papers are only an ac- 
counting to fellow workers in speech 
and psychology of the important aspects 
of the research and training programs 
that evolved from fulfilling a highly
specific request that called for answers
with almost whip-cracking suddenness. 
These aspects are primarily ones of gen- 
eral method. 


A number of persons beyond the
writers of papers for this series contributed
to the results. Drs. Howard
Gilkinson, Russell Lembke, C. Horton 
Talley, and Wilbur Moore worked in 
the laboratory for varying lengths of 
time. Also, men—some with considerable 
academic training and experience in 
speech or related subjects—were attached 
to the Laboratory as army personnel: 
Frank M. Lassman, Earl D. Schubert, 
William Crozier, George J. Fortune, 
Nelson R. Nail, Donald P. Veith, R. 
E. Waggoner, John H. Wiley, Scott H. 
Zahren, and Walter Powell. They are 
represented in this series only by Professor
Stevens, who, as Lieutenant,
headed the detachment. Two of the 
group, Nail and Powell, aided E. J. 
O’Brien, engineer, in the design and 
construction of training equipment and 
in servicing experimental apparatus. 
The others contributed in a general and 
also in an important way to the progress 
of the program. 


The organization of National Defense 
Research Committee projects, of which 
this was one, also provided advice and 
guidance to the program. The work was 
done under the auspices of the Applied 
Psychology Panel of NDRC with the 
Psychological Corporation as contractor. 
This arrangement brought the experi- 
ence in psychological experimentation 
of Drs. George K. Bennett, Walter Hun- 
ter, Dael Wolfle, and Charles Bray to 
bear upon the program. 


The arrangement provided the project 
with the further advantages of an army 
liaison officer, Dr. Don Lewis, whose 
experimental work in speech and psychology
has been considerable and whose
advice tended to keep the program in 
line with immediate needs of the army 
and away from the diverting and equally 
stimulating leads that held theoretical 
interest for workers interested in voice 
science. 

In large measure the present writers
are reporting the work for the group.
No fair apportioning of credit is pos- 
sible. Some experiments were run simul- 
taneously with all hands contributing 
to ideas, experimental designs, execu- 
tion and interpretation. Deadlines and 
shifting priorities dictated an over-all 
plan of expediency. [J. W. B.] 
The principal problem in communicating
in high-level noise such as in
military airplanes is comprehending the 
message. The communication is through 
carbon microphones, an amplifier, and 
headsets. Casual observation shows that 
speakers in these circumstances differ in 
their efficiency or their capacity to get 
messages understood. An objective
statement of the relative intelligibility 
of different speakers, however, depends 
upon systematic and reliable measure- 
ment. A reliable testing technique, then, 
was the first requisite of a research program
concerned with voice intelligibility
in aircraft. Moreover, to accommodate
a hurry-up routine and class schedules
of Army Air Forces training, the
measuring device not only had to give 
a good answer, but be brief and, in a 
sense, a group test. 


A background for measuring intelli- 
gibility lay in the articulation tests 
that were used by Dr. Harvey Fletcher 
twenty years ago—tests that were not 
followed up in the field of speech.¹
Perhaps this was so because Fletcher was 
not concerned with the speaker but with 
phonetic elements, and later with the 
practical problem of evaluating tele- 
phonic equipment. Understandability 
was measured by speakers reading 
sounds, syllables, words, and sentences 
to listeners whose responses indicated
comprehending or not comprehending
the message.


Following somewhat closely the work
of Fletcher, the Harvard Psycho-Acoustic
Laboratory² recently continued the
development of articulation tests, particularly
of the word and sentence types.
This work, just before and during
World War II, was especially important
in stressing equivalence among different
test forms. This was in keeping with
the primary uses of the measures, namely,
to differentiate among pieces of communication
equipment in terms of efficiency,
and to measure psychological
phenomena among listeners who were
subjected to an environment of high-level
noise. Probably as a combination
result of a priori reasoning, face validity
and the limited number of speaker-listener
responses that were available for
standardization of item values, some
care was taken to equate the phonetic
elements among the test forms. The tests
were lengthy, sufficiently so that a few
speakers, for example, three, could read
one test each in each of two or more
experimental conditions and the proportion
of correct responses made by a
group of practiced listeners provided reliable
statements for comparisons.


1 Fletcher, Harvey, Speech and Hearing (New
York, 1929).

2 Psycho-Acoustic Laboratory (Harvard University,
Cambridge, Massachusetts).


The Voice Communication Labora- 
tory faced some circumstances unlike 
those encountered by either Fletcher or 
the Psycho-Acoustic Laboratory. First, 
the job of developing a training pro- 
gram necessitated testing the speaking 
intelligibility of students (not equip- 
ment or sounds) in order to determine 
the effectiveness of voice training. Sec- 
ond, no constant or practiced listening 
group was available. Third, because of
the large flow of students or experimental
subjects and the limited time available,
there was a premium on short tests.
Fourth, this same flow of students provided
a larger sample of speakers and
listeners than had favored other investigators
in the field of intelligibility or
articulation testing.


The project workers looked hopefully 
at possible new approaches to intelligi- 
bility testing, for example, ones that 
might relate speaker effectiveness with 
the airman’s job performance, or a nega- 
tive score to indicate how much and 
what kind of selective amplification of 
the voice would make an individual
speaker more intelligible. However, 
such types of testing, if practical, would 
have been costly in time of construction. 
The pressure of immediate need led to 
adapting the most promising of the 
techniques already in use, a pattern
built around word intelligibility. 

The testing process that was developed
operated as follows. From 8 to 12
students were stationed on a party-line
network over which they could talk and
listen to each other through airplane
intercommunication equipment. High-level,
airplane-type noise filled the testing
room.³ Each student read a list of
words in the manner, “Number one is
fog. Number two is dashboard,” and the
remaining persons on the line wrote
what they heard the test words to be.
The introductory phrase tended to make
the reading of the words sufficiently slow
for the listeners to keep up. Occasionally
the monitor had to request that the
reading be slower. The members of the
party line rotated as speakers, each using
a different test list. The speaker's
intelligibility score was the proportion
of correct responses made by all of the
listeners. Four such party lines or circuits
operated in the same room simultaneously.


3 The noise was controlled in intensity, 108-110 db,
and in spectrum (Spectrum I, Psycho-Acoustic
Laboratory).

This was probably a good approximation to
a median noise level among military aircraft.
Noise levels as high as 125-128 decibels have
been measured in the most noisy planes. In the
least noisy planes, noise levels were of the order
of 85-90 decibels.

The noise spectrum was typical of that in
bomber type aircraft. The frequencies below
1000 c.p.s. were of highest intensity with the
curve showing a gradually increasing negative
slope for the frequencies above 1000 c.p.s.

Noise was electrically generated and distributed
throughout the room by means of properly
placed speaker-reproducers.
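The scoring rule for the party-line tests (the proportion of correct responses made by all listeners to one speaker) can be sketched in a few lines. This is only an illustration: the function name, the exact-match rule for counting a response as correct, and the sample answer sheets are assumptions, and the score is expressed here as a percentage.

```python
def intelligibility_score(spoken, responses):
    """Return the percentage of listener responses that match the spoken words.

    spoken    -- words read by the speaker, in test order
    responses -- one list per listener, aligned item by item with `spoken`
    """
    total = 0
    correct = 0
    for listener in responses:
        for said, heard in zip(spoken, listener):
            total += 1
            # Assumed rule: a response counts only if it matches the test word.
            if heard.strip().lower() == said.strip().lower():
                correct += 1
    return 100.0 * correct / total


# Hypothetical write-down sheets: one speaker, three listeners, four items.
spoken = ["fog", "dashboard", "cold", "flight"]
responses = [
    ["fog", "dashboard", "gold", "flight"],
    ["bog", "dashboard", "cold", "fright"],
    ["fog", "dashboard", "cold", "flight"],
]
score = intelligibility_score(spoken, responses)
print(score)  # 9 of the 12 responses are correct
```

Because every listener on the line contributes to the score, a larger panel yields more responses per test word without lengthening the test.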


Obviously, the first crucial step in the 
development of this procedure was the 
selection of test items, and the second, 
the combining of words into tests of 
equal difficulty. The following criteria 
were followed as closely as was practic- 
able in the selection of words: 

1. Use of one and two-syllable words.

2. Use of words with Thorndike ratings of 10
or less.⁴

3. Use of words that in trial tests were pronounced
correctly at least 90 per cent of the
time.

4. Use of words that on trial tests were between
20 and 80 per cent intelligible. (The
intelligibility value of a word is the proportion
of times that it is heard correctly
when spoken by many speakers.)

5. Avoidance of homonyms.

6. Avoidance of words with alternative stress
patterns.


The final tests were drawn from 5,000 
words that were used in trials. 
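The six selection criteria amount to a filter over the pool of trial words. The sketch below is a hypothetical rendering: the record fields, the 90 per cent reading of the pronunciation criterion, and every value in the sample pool are assumptions made for illustration, not data from the trials.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    word: str
    syllables: int
    thorndike_rating: int
    pronounced_correctly: float    # proportion of trial readings pronounced correctly
    intelligibility: float         # proportion of times heard correctly in trials
    is_homonym: bool
    has_alternative_stress: bool


def meets_criteria(c):
    """Apply the six criteria listed above to one candidate word."""
    return (
        c.syllables in (1, 2)
        and c.thorndike_rating <= 10
        and c.pronounced_correctly >= 0.90
        and 0.20 <= c.intelligibility <= 0.80
        and not c.is_homonym
        and not c.has_alternative_stress
    )


# Hypothetical pool; all figures are invented for the example.
pool = [
    Candidate("fog", 1, 3, 0.99, 0.45, False, False),
    Candidate("dashboard", 2, 9, 0.97, 0.60, False, False),
    Candidate("night", 1, 1, 0.99, 0.50, True, False),       # homonym
    Candidate("record", 2, 2, 0.98, 0.55, False, True),      # alternative stress
    Candidate("altimeter", 4, 10, 0.92, 0.40, False, False), # too many syllables
    Candidate("yes", 1, 1, 1.00, 0.93, False, False),        # too easily heard
]
selected = [c.word for c in pool if meets_criteria(c)]
print(selected)
```

Since the criteria were followed only "as closely as was practicable," a working version would presumably relax individual tests rather than apply all six strictly.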


In combining the words into test 
lists the items were assigned to make the 
lists equal in mean intelligibility values 
and approximately so in standard devia- 
tion. Repetitions of initial sounds and 
suffixes within a list were generally 
avoided, although this limitation was 
viewed as inconsequential. 
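One simple way to build lists with equal mean intelligibility is to sort the items by value and deal them out in serpentine order. The text does not say how the assignment was actually made, so the procedure, the function name, and the item values below are assumptions offered only to make the goal concrete.

```python
from statistics import mean, pstdev


def deal_into_lists(values, n_lists):
    """Sort item values and deal them forward, then backward, across the lists."""
    lists = [[] for _ in range(n_lists)]
    ordered = sorted(values, reverse=True)
    for i, value in enumerate(ordered):
        round_number, place = divmod(i, n_lists)
        # Reverse direction on every other round so no list keeps the best items.
        index = place if round_number % 2 == 0 else n_lists - 1 - place
        lists[index].append(value)
    return lists


# Hypothetical item intelligibility values, in per cent.
item_values = list(range(25, 85, 5))        # 25, 30, ..., 80
word_lists = deal_into_lists(item_values, 3)
list_means = [mean(w) for w in word_lists]
list_sds = [pstdev(w) for w in word_lists]
print(list_means)
```

With these values the three list means come out equal, while the standard deviations agree only roughly, which matches the stated aim of equal means and approximately equal standard deviations. Closer agreement in spread would need a further step, such as swapping items between lists.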

In the main, the experimental work 
was conducted with 24-item word lists, 
of which 48 equivalent forms were con- 
structed. Later, for use in training 
classes these items were refined and re- 
grouped into 24 twelve-word lists. Mini- 
mum standardization for all of the tests 
was based on 50 speakers with 450 lis- 
teners, all from the population for which 
the tests were constructed. 


4 Thorndike, E. L., Thorndike Century Senior 
Dictionary (New York, 1941). 
A third critical stage in the development
was the use of the speakers as
listeners in rotation. Other
researchers have used constant panels of
listeners. The employment of a single
speaker-listener panel of 8-12 members,
however, turned out satisfactorily. The
predicted reliability for different numbers
of listeners is as follows (Spearman-Brown):


N Listeners     r       N Listeners     r
     6         .76          12         .86
     7         .79          16         .89
     8         .81          20         .91


In actual measurements, panels of four listeners
correlated, r = .68, and seven listeners,
r = .83.
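The predicted reliabilities can be checked against the Spearman-Brown formula, r_k = k r / (1 + (k - 1) r), where k is the factor by which the panel is enlarged. The sketch below assumes that the predictions start from the observed four-listener correlation of .68. That starting point is an inference, not something the text states, but the results agree with the tabled figures to two decimals.

```python
def spearman_brown(r, k):
    """Reliability predicted when the number of listeners is multiplied by k."""
    return k * r / (1 + (k - 1) * r)


r_four = 0.68  # observed correlation for panels of four listeners (from the text)
panel_sizes = (6, 7, 8, 12, 16, 20)
predicted = [round(spearman_brown(r_four, n / 4), 2) for n in panel_sizes]
for n, r in zip(panel_sizes, predicted):
    print(n, r)
```

On this reading, the observed value of .83 for seven listeners is somewhat better than the .79 the formula predicts.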


Revisions and extensions of the word 
tests continued throughout the duration 
of the project and culminated in a set 
of multiple-choice tests in which the 
listener choices represented the four 
most common error-substitutions on the 
part of listeners who took the write- 
down tests. However, since the experi- 
mental results in the accompanying 
papers were derived from write-down 
tests, the other methods of measurement 
—including the multiple choice tests— 
will not be discussed. Moreover, atten- 
tion is focused on the principal set of 
write-down tests. Others were con- 
structed for special purposes, for exam- 
ple, for use with the throat microphone. 


The following indexes relate to the 
adequacy of the tests. First, the mean 
intelligibility score for a sample of 169 
untrained speakers under laboratory 
conditions was 50.0, S.D., 12.0. Second, 
split-half correlations, corrected for 
length, of the measures of individual 
speakers were .86 to .94. Third, relative
intelligibility values of test items as
determined at different training centers
correlated .86 to .92. Fourth, no significant
differences were found (F-tests) between
the means of word lists, F = 0.26
when 1.61 = 5 percent level of confi- 
dence. (An assumption in an analysis 
of variance is that the variation in all 
sub-divisions is the same. “L” was com- 
puted, and it showed no lack of uniform 
variance among the tests.) Fifth, con- 
sistent differentiation was shown be- 
tween trained and untrained classes. 


Although reference is made above to 
the reliability of the measures for a 
single speaker, in practice a condition 
was always represented by a composite 
score for a large group, with usually 24-48
members. In the gross, this meant
that each experimental condition was 
represented by measures from at least 
576 test items as spoken by 24 subjects. 


These tests served to measure the 
average effectiveness of microphone positions,
voice variables, and training techniques.⁵
They also proved to be motivating
when used in training and were
included in one form or another at the 
outset and conclusion of all voice com- 
munication courses. 


By way of exemplification, one short 
word list follows: fog, dashboard, cold, 
flight, headwind, roll, missile, course, 
binding, practice, socket, impulse. 


It is observed that the standardization
of these tests provided the largest number
of speaker-listener responses to individual
words that has occurred to our
knowledge. This scope of study verified
indications that for a single condition
the relative intelligibility of words is
markedly stable from person to person
and, for one person, from time to time.

Over and above this, however, is the
implication that intelligibility testing
by one method or another may well be
exploited in speech researches and in
speech courses.


5 The relationships between voice factors and
intelligibility are treated in the following papers.
The reader will note that there is no
development of rate of speaking as related to
intelligibility. The topic was inadequately investigated.
Within broad limits slow and fast
speaking were found to be detrimental to intelligibility,
but no important differences were
established for rates between 100 and 180 words
per minute. Contributing to the failure to investigate
rate more thoroughly was the inapplicability
of the word test for measurements
of intelligibility of contextual speaking, and
the lack of discrimination afforded by sentence
tests that under the conditions yielded a mean
score of about 80%. [Ed.]
INTELLIGIBILITY RELATED TO MICROPHONE POSITION


JAMES F. CURTIS 
State University of Iowa 


One of the most obvious determiners
of intelligibility of speech over electrical
communication systems in a noisy
environment, as in military aircraft, is 
the positioning of the microphone. All
personnel concerned with the problems
of voice communication in Army Air
Forces agreed on its importance but not
on what constituted the optimum
position; many positions, including some
too bizarre to merit serious consideration,
were advocated. Accordingly, the
following series of experiments was undertaken.

Three classes of microphones—all car- 
bon—were in general use by AAF during 
World War II, hand-held, throat, and 
a special insert-type for fitting into the 
oxygen mask. Since only the first two 
could be varied in the positions in 
which they were held or worn, the investigations
concerned only the hand-held
and throat microphones. The experiments
were similar in purpose and
design. In each the purpose was to dis- 
cover which method of holding or wear- 
ing (in the case of the throat micro- 
phone) the microphone would result 
in the maximum intelligibility of speech, 
considering all speakers. All experiments
employed equipment commonly
used by AAF personnel and the tests
were conducted in 108-110 db aircraft- 
type noise. The criterion of merit for 
all comparisons was the average score on 
word intelligibility tests. In each in- 
stance, all positions were tested with 
the same group of speakers, each speaker 
in a single sitting reading a test list 
with each microphone position. How- 
ever, the serial order of the microphone 
positions was rotated from speaker to 
speaker in such manner that, although 
the particular order used by any speak- 
er was a chance matter, over the whole 
experiment the order conditions were 
counterbalanced. It was thus possible to 
minimize sampling errors and to assure 
that no position would be systematically 
given any advantage because of an order 
effect. 
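The counterbalancing described above can be illustrated with a cyclic rotation of the serial order, with the resulting orders handed to speakers at random. The sketch assumes twelve speakers, a multiple of the six hand-held positions, so that the balance comes out exact. The function name, the seed, and the group size are illustrative choices, not details of the experiments.

```python
import random
from collections import Counter


def rotated_orders(positions, n_speakers, seed=0):
    """Give each speaker a cyclic rotation of the position list, assigned by chance."""
    k = len(positions)
    orders = [positions[i % k:] + positions[:i % k] for i in range(n_speakers)]
    random.Random(seed).shuffle(orders)  # which speaker gets which order is random
    return orders


positions = ["zero-distance", "1/2-inch", "1-inch",
             "30-degree", "45-degree", "thumb-encircling"]
orders = rotated_orders(positions, n_speakers=12)

# Count how often each position falls in each serial place across all speakers.
counts = Counter((place, pos) for order in orders for place, pos in enumerate(order))
print(set(counts.values()))
```

A rotation of this kind equalizes how often each position is tested first, second, and so on. It does not balance which position immediately precedes which, which would require a fuller design.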

Student pilots in their first four weeks 
of basic pilot training served as speak- 
ers and listeners. 

The first experiment was concerned
with a comparison of six positions for
the hand-held microphone. The arrangement
of equipment is diagrammed
in Figure 1, which shows a standard interphone
circuit (hand-held microphone,
interphone amplifier, and earphone
circuit) with a vacuum-tube
voltmeter connected in parallel with the
earphones to measure intensity levels of
speech signals and noise pick-up of the
microphone.

FIGURE 1. SCHEMATIC DIAGRAM OF EQUIPMENT SET-UP FOR FIRST HAND-HELD MICROPHONE
POSITION EXPERIMENT (components shown: hand-held microphone, interphone amplifier,
22 pairs of earphones, vacuum-tube voltmeter)

Each of the six microphone positions 
was tested with a group of 16 speakers. 
The positions were the ones most com- 
monly recommended by flying person- 
nel. For the zero-distance position the 
microphone was placed so that the 
speaker talked directly into it, the mi- 
crophone being sufficiently close to the 
speaker’s mouth to touch the lips light- 
ly, but not to interfere with articulatory 
movements. The one-half inch distance
position differed only in that the microphone
was positioned one-half inch away from the
speaker’s lips. Similarly, the one-inch
distance position differed only in the 
greater distance of the microphone from 
the speaker’s mouth. For the 30-degree 
position the plane of the microphone 
was rotated on a vertical axis away 
from the speaker's mouth through 30 
degrees of arc. The closer edge of the 
microphone remained close enough to 
just touch the speaker's lips. The 45- 
degree position differed from the 30- 
degree position only with respect to the 
size of the angle formed by the micro- 
phone and the speaker’s face. A sixth
position was described as the thumb-encircling
position. For it the speaker
endeavored to shield the microphone
from the room noise by encircling the
rim of the microphone with his thumb,
then held the microphone as close as
possible to the lips while maintaining
this position.

Except for the thumb-encircling position,
which necessitated that the speaker
himself hold the microphone, an adjustable
microphone holder and head-rest
was employed to insure maintenance
of the required position.

No restrictions were imposed on the
speakers with respect to the loudness
level of the voice to be employed, since
it was reasoned that there might be
natural adjustments in voice level, from
position to position, which ought to be
considered associated with the positions.
This part of the design probably accounted
for the anomalous results obtained
for one comparison. Readings
were made of the peak intensity accompanying
each test word and the noise
pick-up of the microphone during the
intervals between words as indicated by
the vacuum-tube voltmeter.¹

1 The recorded voltages for individual speech
peaks were subject to errors in reading the
meter by the observer; however, when indications
were averaged for a relatively large number
of test words, it was found that sufficiently
stable values could be obtained for comparisons
with other values obtained in the same
manner.

On disyllabic words the meter deflection for
the accented syllable was recorded for each such
word.


TABLE I. INTELLIGIBILITY SCORES AND VALUES FOR SPEECH PEAKS, NOISE PICK-UP OF MICROPHONE,
AND SPEECH-TO-NOISE RATIO FOR SIX MICROPHONE POSITIONS. N = 16.

Microphone          Mean              Average          Noise Pick-up     S/N³
Position            Intelligibility   Speech Signal    of Microphone²    Ratio
                    Score             Intensity

Zero-distance           43.0              12.8               0.4          12.4
1/2-inch                36.1              10.9              -0.1          11.0
1-inch                  31.4               9.6               0.0           9.6
Thumb-encircling        46.4              14.7               4.9           9.8
30-degree angle         38.7              12.2               0.8          11.4
45-degree angle         39.4              11.4              -0.1          11.5


2 The values for Speech Peaks and Noise Pick-up of the Microphone are db relative to one
volt. Conversions to decibels were made from geometric means of voltage indications since
intensity in decibels is a logarithmic function of voltage.

3 Speech-to-Noise ratio is the difference in intensities computed in decibels. It is obtained
by subtracting a value in column four from the corresponding value in column three.
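The two table notes describe a short computation: voltage readings are averaged geometrically and expressed in decibels relative to one volt, and the speech-to-noise ratio is the speech level minus the noise level. The sketch below follows that description. The voltage readings are hypothetical, the function name is an assumption, and the speech and noise levels are the Table I entries as read here.

```python
import math


def db_re_one_volt(voltages):
    """Level in decibels re 1 volt, based on the geometric mean of the readings."""
    mean_log = sum(math.log10(v) for v in voltages) / len(voltages)
    # 20 * log10 of the geometric mean equals 20 times the mean of the log10 values.
    return 20.0 * mean_log


# Hypothetical meter readings in volts; their geometric mean is 4 volts.
level = db_re_one_volt([2.0, 8.0])

# (position, speech signal in db, noise pick-up in db) taken from Table I.
table_1 = [
    ("zero-distance", 12.8, 0.4),
    ("1/2-inch", 10.9, -0.1),
    ("1-inch", 9.6, 0.0),
    ("thumb-encircling", 14.7, 4.9),
    ("30-degree angle", 12.2, 0.8),
    ("45-degree angle", 11.4, -0.1),
]
sn_ratio = {name: round(speech - noise, 1) for name, speech, noise in table_1}
print(sn_ratio)
```

Averaging the logarithms of the voltages gives the same result as averaging the individual decibel values, which is presumably why the geometric mean was preferred to the arithmetic mean of the voltages.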
Examination of Table I shows that
intelligibility scores, average speech- 
signal intensities, and speech-to-noise 
ratios all decrease as the distance of the 
microphone from the speaker’s lips is 
increased and, further, that changing 
the distance by an amount as small as 
one-half inch produces a marked effect. 
Statistical analysis showed these differen- 
ces to be highly significant. Changing 
the angle of the microphone so that the 
speaker talked less directly into it also 
lowered the intelligibility score, although 
less markedly. 

The comparison between the thumb- 
encircling position and the zero-distance 
position was not so conclusive, the dif- 
ference not being statistically significant. 


Moreover, the relation between the
intelligibility scores and the speech-to-noise
ratios of these two positions
seemed anomalous, since the thumb-en- 
circling position showed a less favorable 
speech-to-noise ratio, although its in- 
telligibility score was somewhat higher. 
In addition, speakers seemed to be much 
more variable with respect to the success 
with which they were able to use the 
thumb-encircling position. Further ex- 
perimentation with these two positions 
was planned. 

In the second experiment, a comparison
between zero-distance and thumb-encircling
positions, some changes in
method were made. The most important
was to control the loudness of
speaking, keeping it as nearly constant
as possible for the two positions. In addition
to the regular microphone-amplifier-earphone
circuit previously used,
a separate monitoring circuit employing
a throat microphone was included (Figure 2).
This monitoring circuit provided
the means for controlling loudness since
the intensity of signal picked up by it
was independent of the position of the
hand-held microphone. Speakers were
instructed to attempt to speak each word
with a loudness that would produce an
assigned deflection of the VU meter of
the monitoring circuit.

FIGURE 2. SCHEMATIC DIAGRAM OF EQUIPMENT SET-UP FOR SECOND HAND-HELD MICROPHONE
POSITION EXPERIMENT (components shown: VU meters, main listeners' circuit with 12 pairs
of earphones, auxiliary listeners' circuit with 10 pairs of earphones, monitor interphone,
experimenter's earphones, speaker's earphones)

Other minor equipment changes in- 


cluded use of a newer type, non-resonant | 


earphone.* Because these earphones 
were of considerably lower impedance 
it was necessary to use a booster ampli- 
fier (high impedance input) to feed part 
of the earphones in the listening circuit 
if the ordinary operating voltage ratio 
in the microphone and earphone cir- 
cuits was to be maintained.® 

Table II presents the results of the experiment in which 24 speakers
read test lists at each position, zero-distance and thumb-encircling.
First, a clear superiority is shown for the zero-distance position,
the mean difference in intelligibility scores (8.9 points) being
significant at the one per cent level of confidence. Second, it should
be noted that the data on noise pick-up of the microphone for the two
positions are at variance with those recorded earlier in which
substantially greater noise pick-up accompanied the thumb-encircling
position. It was thought probable that, in the earlier experiment,
some of the subjects had failed to hold the microphone closely enough
to the mouth to provide an effective noise shield. Added precautions
were therefore taken to instruct the subjects carefully and insure
that this was done. However, speakers still experienced varying
degrees of success in shielding, and with some the noise pick-up with
this position was higher than with the zero-distance position. The
average for all speakers was very nearly the same for the two
positions.

TABLE II.—COMPARISON BETWEEN TWO METHODS OF HOLDING THE HAND-HELD MICROPHONE. N = 24.

                                                     Position
Condition                                   Zero-Distance     Thumb-Encircling
Mean Signal Level in VU6                    -1.6              -1.4
Mean Range of Noise Pick-up in VU           -10.g to -15.0    -11.0 to -16.5
Mean Intelligibility Score                  60.4              51.5
Mean Difference in Intelligibility Score              8.9
t                                                     3.94
Significance Level                                    1%

The data were broken down further to investigate possible interaction
between the differential success in shielding the microphone from the
ambient noise and the corresponding intelligibility scores. Table III
shows that both groups, irrespective of their success in shielding the
microphone, made higher average intelligibility scores with the
zero-distance position.

4 The older type earphones used in the first experiment had a badly
peaked frequency response characteristic. The newer type, attaining
widespread distribution at the time this experiment was conducted, had
a response curve which was substantially flat from 100 c.p.s. to 4000
c.p.s.

5 Twelve pairs of earphones were the most that would ordinarily be
connected to a single amplifier.

6 VU relative to 0 VU = 4.2 volts.


TABLE III.—COMPARISON OF TWO GROUPS OF SUBJECTS DIVIDED AS TO SUCCESS IN SHIELDING
NOISE FROM THE HAND-HELD MICROPHONE WITH THE THUMB-ENCIRCLING POSITION.

                                       Subjects with Low     Subjects with High
                                       Noise Level for       Noise Level for
                                       Thumb-encircling      Thumb-encircling
Condition                              Position              Position
Mean Range of Noise Pick-up in VU      -15.5 to -20.6             to -13.5
Mean Intelligibility Score
  Thumb-encircling Position            50.3                  46.8
  Zero-Distance Position               62.1                  57.1
  Difference                           11.8                  10.3

The last experiment in the series concerning microphone placement
dealt with the problem of the optimum position for wearing the throat
microphone. This unit consists of two small microphone buttons built
into a strap which may be fastened around the neck so that the buttons
rest on either side of the forward part of the neck in the region of
the larynx.


The equipment for this experiment 
was exactly like that of the one just 
described except that the hand-held 
microphone was used for monitoring, 
the throat microphone being the instru- 
ment under test. Three positions were 
tested with each of 23 speakers. The 
center position was achieved by placing 
the microphone as nearly as possible at 
the level of the prominence of the lar- 
ynx, with the two buttons of the micro- 
phone resting on the side plates of the 
thyroid cartilage. For the low position 
the experimenter placed his finger across 
the front of the subject’s larynx at a 


point corresponding to the center posi- 
tion and adjusted the microphone into 
position below and touching the finger. 
The high position was the same, except 
the microphone was placed above and 
touching the finger. Microphone positions
were easily maintained if the strap
about the speaker’s neck was placed to
correspond to the microphone position.


The results of this experiment, pre- 
sented in Table IV, show that the low 
position is definitely inferior. The intensity
of speech as heard by the listeners
is appreciably less and the intelligibility 
scores for speakers using this position 
are much lower. The difference between 
the high and center positions was small 
and not statistically significant. 

These findings led to recommended 
practices embodied in course curricula 
and set standard practices for subsequent 
researches. 


TABLE IV.—INTELLIGIBILITY SCORE DATA AND SIGNAL LEVELS FOR THREE POSITIONS OF THE
THROAT MICROPHONE. N = 23.

a. Measurements
                                                   Position of Microphone
Condition                                          Low       Center    High
Mean Intelligibility Score                         17.3
Mean Signal Level in VU, Throat Microphone         -6.2      -2.9      -2.2
Mean Signal Level in VU, Monitoring Circuit        -0.6      -0.6      -0.4

b. Statistical Analysis of Intelligibility Score Data

Comparison       Difference    S. E. Diff.    t
Low-Center       16.1          2.7
Low-High         18.2          3.3            5.51
Center-High      2.1           2.5            .84
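
The last column of part b is consistent with the ratio of each difference to its standard error (18.2/3.3 is about 5.5, and 2.1/2.5 is 0.84). The sketch below recomputes that ratio from the two printed columns. It assumes the column is such a ratio; the function name `critical_ratio` is illustrative and does not come from the source.

```python
# Differences and standard errors as printed in part b of Table IV.
COMPARISONS = {
    "Low-Center": (16.1, 2.7),
    "Low-High": (18.2, 3.3),
    "Center-High": (2.1, 2.5),
}


def critical_ratio(difference, standard_error):
    """Return the ratio of a mean difference to its standard error."""
    return difference / standard_error


for name, (difference, standard_error) in COMPARISONS.items():
    ratio = critical_ratio(difference, standard_error)
    print(f"{name}: {ratio:.2f}")
```

Small disagreements in the second decimal place are to be expected, since the printed differences and standard errors are themselves rounded.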
INTELLIGIBILITY RELATED TO LOUDNESS


PAUL MOORE 
Northwestern University 


As with other aspects of voice, individual
air-crewmen differed in their
estimates of optimum loudness for best
communication. Various suggestions
were made in the Army Air Forces train- 
ing manuals, but they reflected an in- 
complete understanding of the funda- 
mental problem—the masking of the 
voice by the intense noises in the aircraft. 
Typical advice was to use a conversa- 
tional voice, or to speak in a normal 
or customary manner. At the same time 
communication failures were reported 
which were due to faulty use of com- 
munication equipment. Use of proper 
loudness by the radio and interphone 
talker was clearly indicated. 

Among the several investigations 
planned to discover the best way to talk 
over airborne communication systems, 
the ones treating with loudness of voice 
were given first place. These experiments 
were set up to answer the following 
questions: (1) When talking from one
position to another in a bomber, how
loudly should one speak when using (a)
a hand-held microphone, (b) a throat
microphone, (c) an oxygen mask with
insert-type microphone? (2) Do the
results differ with the type of headset
used by the listener? (3) How do the 
several kinds of earphone cushions 
influence intelligibility? (4) Is the opti- 
mum loudness over the interphone also 
best for the several most common radios? 

The first experiments treated with 
talking over the interphone. Twenty-four
student pilots read intelligibility
tests in the conventional manner, that 
is, to groups of their associates over air- 
craft communication equipment in a 
room filled with 108-110 db of airplane- 
type noise. In each study the speakers
used four degrees of loudness ranging
from conversational to shouting, or more
precisely, the levels were separated by
approximately three db. The equivalent
voltages across the listeners’ headphones 
for the four degrees of loudness, when 
the high-impedance resonant earphones 
that were standard equipment in 1943 
were used, averaged 7.0, 10.0, 14.0 and 
20.0; with later model, low-impedance, 
non-resonant earphones the equivalent 
voltages were 2.1, 3.0, 4.2, and 6.0. Each 
man had a period of training on his 
test list at the specified loudness that 
he was expected to use in his subsequent 
reading. He regulated the level of his
voice both during practice and in the 
final reading by watching the deflection 
of either a vacuum-tube voltmeter or 
VU meter connected in the output cir- 
cuit. As each test word was spoken in the 
experimental conditions, the monitor 
recorded the actual meter deflection. 

In working with the different ear- 
phones and earphone-cushion combina- 
tions and with the hand-held and throat 
microphones the speakers used simul- 
taneously a hand-held microphone 
placed at the lips and a throat micro- 
phone fastened at the level of the Adam's 
apple. Each microphone was connected 
to an interphone amplifier which fed a 
circuit of twelve pairs of earphones. 
With this arrangement, each word 
called by a speaker produced a signal 
in both circuits, and data were obtained 
for the two microphones simultaneously. 
The results of these experiments are 
presented in Table 1 and Figure 1. 


In Figure 1 the per cent of intelligibil- 
ity is plotted against relative intensity 
in db. Each set of curves presents data 
for both microphones used with a single 



14 SPEECH MONOGRAPHS 


TABLE I.—INTELLIGIBILITY SCORES FOR FOUR LEVELS OF LOUDNESS. N = 24.

         Hand-Held (T-17) Microphone             Throat Microphone
   Mean Voltage   Relative    Per cent     Mean Voltage   Relative    Per cent
   Across         Intensity   Intelli-     Across         Intensity   Intelli-
   Earphones      Db          gibility     Earphones      Db          gibility

High-impedance Earphones in Sponge-rubber Cups (MC-162 Cushions). See first pair of curves, Figure 1.

   7.3                        36.9         7.2            -2.8 (1)    18.2
   10.3           0.3         43.7         13.1           2.4         31.4
   14.6           3.3         48.3         17.2           4.7         33.0
   19.9           6.0         50.0         21.1           6.5         30.0

High-impedance Earphones in Smooth Rubber-seal (MC-162-A) Cushions. See second pair of curves, Figure 1.

   7.4            -2.6 (1)    48.2         8.5            -1.4 (1)
   10.6           0.5         56.4         12.1                       37.7
   14.3           3.1         58.2         16.2           4.2         37.9
   19.5           5.8         56.0         21.0           6.4         31.8

Low-impedance Earphones in Smooth Rubber-seal Cushions. See third pair of curves, Figure 1.

   2.1            -3.0 (2)    52.1         2.4            -1.9        45.3
   3.2            0.6         58.6         3.5            1.3         47.6
   4.2            3.0         61.0         4.3            3.1         49.5
   5.6            5.4         61.3         5.6            5.4         40.6

1 Db relative to 10 volts.
2 Db relative to 3 volts.
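
The relative-intensity column of Table I can be approximated from the voltage column with the ordinary voltage-to-decibel relation, 20 log10(V / V_ref), taking the footnoted reference voltages of 10 and 3 volts. The sketch below is illustrative only: the function name `relative_db` is an assumption, and the authors may have averaged in decibels rather than in volts, so a few rows of the table differ from this calculation by about a tenth of a db.

```python
import math


def relative_db(voltage, reference_voltage):
    """Voltage level in db relative to a reference voltage."""
    return 20.0 * math.log10(voltage / reference_voltage)


# (mean voltage across earphones, reference voltage) pairs taken from Table I.
ROWS = [
    (10.3, 10.0),
    (14.6, 10.0),
    (19.9, 10.0),
    (3.2, 3.0),
    (5.6, 3.0),
]

for voltage, reference in ROWS:
    level = relative_db(voltage, reference)
    print(f"{voltage:5.1f} V re {reference:4.1f} V = {level:+.1f} db")
```

The same relation shows why the four loudness levels are described as roughly three db apart: each nominal voltage is about 1.4 times the one below it.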


headset-earphone cushion combination. 
It will be observed that the hand-held 
microphone provides higher intelligibil- 
ity than the throat microphone; also, 
that the higher levels of loudness are 
superior to the lower levels. The third 
level in each instance represents a loud 
voice, the fourth, shouting. In all in- 
stances intelligibility increases with 
loudness up to and including the third 
level. 

The statistical significance of differences
between the four loudness levels
was tested by a separate analysis of var- 
iance for each equipment combination 
(F-test). The results appear in Table 
II. An F which was significant at or be- 
yond the one per cent level of confidence 
was obtained for each equipment combi- 
nation with one exception: the throat 
microphone with the high-impedance 
headsets in the smooth rubber-seal earphone
cushions (significant substantially
above the five per cent level of confidence).
Clearly, the differences in intelligibility
scores between loudness levels
for each equipment combination are
larger than can be accounted for by
chance.
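
The comparison described above can be sketched by checking each reported F against the two critical values printed with Table II (4.08 at the one per cent level, 2.74 at the five per cent level). The dictionary keys and the function name `significance` are illustrative labels, not terms from the source.

```python
# Critical F values as printed beneath Table II.
F_ONE_PER_CENT = 4.08
F_FIVE_PER_CENT = 2.74

# F values reported in Table II, keyed by (equipment group, microphone).
F_VALUES = {
    ("A: high-impedance, MC-162", "hand-held"): 19.56,
    ("A: high-impedance, MC-162", "throat"): 15.54,
    ("B: high-impedance, MC-162-A", "hand-held"): 10.80,
    ("B: high-impedance, MC-162-A", "throat"): 3.45,
    ("C: low-impedance, MC-162-A", "hand-held"): 7.58,
    ("C: low-impedance, MC-162-A", "throat"): 4.33,
}


def significance(f_value):
    """Classify an F value against the two printed critical values."""
    if f_value >= F_ONE_PER_CENT:
        return "1% level"
    if f_value >= F_FIVE_PER_CENT:
        return "5% level"
    return "not significant"


for (equipment, microphone), f_value in F_VALUES.items():
    print(f"{equipment}, {microphone}: F = {f_value:.2f} ({significance(f_value)})")
```

Only the throat microphone with high-impedance earphones in MC-162-A cushions falls between the two critical values, which matches the single exception named in the text.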

Table II also indicates the differences 
necessary between any two loudnesses 
with a single equipment combination 
to be considered significant. This test 
shows that some of these differences, 
that is, between some successive points 
on Figure 1, are not significant; however, 
some confidence beyond the indications 
of this test can be placed in the differ- 
ences because of the consistent trend of 
the curves. 

Curves for the oxygen-mask micro- 
phone were not included because the 
differences in intelligibility scores for 
the four loudness levels used were not 
statistically significant, although the
curve tended to follow the others. The
weakest level of speaking, conversational,
was as intelligible as the louder
ones, within the errors of chance.

TABLE II.—SIGNIFICANCE OF DIFFERENCES (Based on Table I).

                                                      Differences between
                                                      Intelligibility Scores
                                                      Required for Significance:
Conditions                                   F        1% Level    5% Level
A. High-impedance earphones with sponge
   rubber cushions (MC-162)
     Hand-held microphone                    19.56    4.8         3.7
     Throat microphone                       15.54    6.2         4.7
B. Same with smooth rubber-seal cushions
   (MC-162-A)
     Hand-held microphone                    10.80    4.9         3.7
     Throat microphone                       3.45     5.6         4.3
C. Low-impedance with same cushions
     Hand-held microphone                    7.58     5.7         4.3
     Throat microphone                       4.33     6.7         5.1
F (1%)                                       4.08
F (5%)                                       2.74

An experiment to determine optimum loudness for three standard
aircraft radio sets was conducted somewhat similarly to the interphone
studies reported above, that is, men in pilot training read
intelligibility tests to groups of their fellows. However, the use of
radio sets necessitated a more elaborate equipment arrangement with
the exception of headsets, microphones, and earphone cushions.

FIGURE 2.—SCHEMATIC DIAGRAM OF EQUIPMENT SET-UP FOR EXPERIMENTS ON RELATION BETWEEN
RELATIVE INTELLIGIBILITY AND LOUDNESS LEVEL OF VOICE WITH THREE AAF AIRCRAFT RADIO SETS.
(Components labeled in the diagram: T-17 Microphone, Transmitter, Dummy Antenna, Receiver,
Booster Amplifier, Speaker’s Headset, Experimenter’s Headset, Vacuum-Tube Voltmeter, and
two VU Meters.)

The arrangement of the equipment 
is diagrammed in Figure 2. The monitor 
arrangement is similar to that described 
above. Comparative frequency response measurements on the headset
circuits indicated that throughout the range of important frequencies,
100-4000 c.p.s., the transmitter side tone matched reasonably well
that of an interphone circuit. The side tone, that is the signal heard
by the speaker, was, therefore, similar in both level and spectrum to
that heard by speakers over an interphone network. The important
difference between the performances of the two setups was that the
receiver output had marked variations in frequency response. This,
however, could not affect the speaker who heard only the side tone,
nor the comparability of listener data from these experiments with
those from interphone equipment.

The data for loudness levels and corresponding intelligibility scores
for three aircraft radio sets are presented in Table III and Figure 3.

TABLE III.—COMPARISON OF INTELLIGIBILITY SCORES FOR FOUR LOUDNESS LEVELS WITH THREE
AIRCRAFT RADIO SETS. N = 24.

                                              Listener Condition
Speaker Condition                      Radio A (4)    Radio B        Radio C
Level 1
  Mean Signal Level, db (3)            -1.4           -1.8           -2.5
  Mean Intelligibility Score           37.8                          59.5
Level 2
  Mean Signal Level, db (3)            1.1            0.5            0.6
  Mean Intelligibility Score           43.4           60.8           65.7
Level 3
  Mean Signal Level, db (3)            3.7            3.0            3.1
  Mean Intelligibility Score           44.3           65.8           68.9
Level 4
  Mean Signal Level, db (3)            6.0            6.0            5.8
  Mean Intelligibility Score           43.6           69.3           66.3
Significance of Differences in
  Intelligibility Scores, F            4.86           19.90          3.24
                                       (1% = 4.96)    (1% = 4.08)    (1% = 4.08)

3 Expressed in db relative to 3.0 volts, converted from geometric means of peak voltmeter
deflections for words. Voltages were measured at speaker’s headset but were identical to peak
voltages at listeners’ headsets.
4 Data from only 23 speakers.

It will be noted that for all three of the radio sets the
intelligibility scores increase with the increasing loudness of voice
through the first three levels as they did in the previously described
studies of microphones, earphones, and earphone cushions used on an
interphone network. On the fourth, or highest level, a decrease in
intelligibility is found with both radio A and radio C. With radio B
the intelligibility score increased even at the loudest level.

F-tests of the differences in intelligibility scores found for the
loudness levels indicate that with all three radio sets the over-all
differences are significant beyond the 5% level of confidence and, in
the case of radio B, highly significant. For the three radios, a
loudness approximating the third level is substantially better than
one that produced weaker signals to the listener’s ear. This is a
strong voice, considerably louder than ordinarily used in conversation
but not so loud as to require the extreme effort of shouting.

These data, obtained from experimental tests of radio equipment, agree
with those from experiments over interphone equipment. The conclusion
is therefore warranted that the radio link makes little, if any,
difference with respect to the relationship between loudness of voice
and intelligibility. There is no important over-modulation of the
aircraft transmitter by loud talking, and results obtained from
interphone experiments may safely be generalized to apply to aircraft
radio.

FIGURE 3.—COMPARISON OF INTELLIGIBILITY SCORES FOR FOUR LOUDNESS LEVELS WITH
THREE STANDARD AAF AIRCRAFT RADIO SETS. (Horizontal axis: Mean Signal Level, db.)

These studies indicate that for operating
both aircraft radios and interphones,
loudness of voice is an important
factor. With all microphones used
by the AAF, except the one fitted in the 
oxygen mask, and for either radio or 
interphone, the speaker’s voice should 
be loud. For training purposes this can 
be effectively described as just under 
shouting. 
THE EFFECT OF VERY LOUD SPEECH SIGNALS
UPON INTELLIGIBILITY 


HARRY M. MASON 
Purdue University 


Among the best-validated content
in voice-communication training 
courses is the instruction to speak loudly, 
as loudly as possible without excessive 
strain or shouting. This led Laboratory
personnel to the opinion that very high 
signal levels could be used, regardless of 
other conditions, without substantial de- 
triment to intelligibility. Experimental 
results in the use of noise-shielded 
microphones suggested, however, that 
very high voice levels do not increase 
intelligibility, and other experiments 
in which rubber plugs were placed in 
the outer ears of listeners indicated that 
improvement in intelligibility may be 
associated with decreased signal levels, 
provided the interfering noise is atten- 
uated as much as the speech signal. 


This paper summarizes experimental 
results which suggest that loud signals 
are not always more intelligible than 
less intense ones, and presents the de- 
tails of an experiment in which the 
effect of loudness of signal, independent 
of the ratio of signal strength to noise 
strength, was given a critical test. 

Of the three carbon microphones used 
in communication, the hand-held picked 
up a great deal of noise and was also 
quite sensitive to speech; the throat 
microphone was somewhat less sensitive 
to both speech and noise than the hand- 


held type; and the oxygen-mask micro- 
phone was well shielded from surround- 
ing noise. With this last type it was 
possible to produce a higher ratio of 
speech-signal strength to noise strength 
than with the other types. (Relative 
speech-to-noise ratios for the three 
microphones appear in a_ preceding 
paper.) 

Table I summarizes the degree to 
which the three microphone-types pro- 
duced increased intelligibility with in- 
creased loudness of voice. It may be 
noted that there is no significant in- 
crease in intelligibility with loudness for 
the mask microphone, though with each 
of the other types intelligibility increases 
up to the high levels. Experimenters at 
first attributed the lowered intelligibilities
at the shouted levels to voice distortion
or to “overloading” of the amplifier
system. Related researches in other lab- 
oratories indicated that slight overload- 
ing of similar amplifiers could be ex- 
pected to increase intelligibility, the 
overloading giving an effect similar to a 
noise limiter, and this, in turn, was 
shown to increase perception of voice 
signals. 

The net effect of the material in Table 
I is to suggest that where noise entering 
the microphone is reduced to a low level, 
there is no advantage to very high speech
levels. The possibilities remain that the
noise-shielding mask muffled high-intensity
speech and thus cancelled the
normal advantage of loudness, or that
the microphone was not able to carry
the high-input levels adequately.

TABLE I.—HEADSET VOLTAGE RELATED TO INTELLIGIBILITY: THREE MICROPHONES.

                  Average     Mean Intelligi-    Mean Intelligi-       Mean Intelligi-
Voice             Headset     bility Mask        bility Hand-Held      bility Throat
Level             Voltage     Microphone         (T-17-B) Microphone   Microphone
Conversational    2           73.0               52.8                  45.3
Loud              3           73.3               68.4                  47.6
Very Loud         4           73.6               71.4                  49.5
Shouting          6           71.2               67.2                  40.6

Further evidence that intelligibility 
does not always improve with very high 
voice levels is presented in Table II. 
These data were collected in a mobile 
training unit in which flying instructors 
at primary training fields were instructed 
in the use of electrical interphone equip- 
ment employing the same microphone 
as the one referred to here as oxygen- 
mask type. The microphone was en- 
closed in an effective rubber noise shield 
strapped across the face in such a man- 
ner that the microphone was about one 
inch in front of the mouth. Earphones 
identical with those used in the experi- 
ments reported in Table I were used. 
Table II shows no substantial increase 
in intelligibility associated with headset 
voltages above 3 volts. 
TABLE II.—INTELLIGIBILITY BY QUINTILES IN SPEECH ENERGY, NOISE-SHIELDED
MICROPHONE. N = 87.

                          Pre-Training Scores               Post-Training Scores
                      Median Headset   Mean Intelli-    Median Headset   Mean Intelli-
Loudness Group        Voltage          gibility         Voltage          gibility
Loudest 20%           4.7              78.4             5.9              82.2
Next Loudest 20%      3.5              76.2             5.0              84.6
Middle 20%            2.4              76.7             4.2              83.7
Next Softest 20%      2.3              73.9             3.6              84.3
Softest 20%           1.6              67.2             2.4              80.2

S.D. for initial intelligibility scores, 10.7; for final, 6.1. Voltage for each individual is average
visual meter readings on 24 words.

Trained speakers (right-hand columns, Table II) were less dependent
upon voice level to produce intelligibility than were these same men
before training. The noise levels employed in the mobile training unit
were somewhat below those used in Laboratory experiments, averaging
90-100 db.

The data presented in Table II are subject to the same limitations as
those of Table I, but they show that, with the particular microphone,
high voice levels are generally not productive of increased
intelligibility.

Data supporting more directly the idea that high signal levels are not
desirable, once a favorable signal-to-noise ratio is established, are
found in experiments where rubber plugs were fitted into the outer
ears of listeners. These plugs, designed to protect the ear from
intense noise, decreased the level of the signal received at the
ear-drum, but did not change essentially the relative loudness of
signal and noise.

In two experiments, student pilots read intelligibility tests at two
loudness levels. Listeners heard half the presentations at each
loudness while wearing rubber ear plugs (NDRC Ear Wardens). Speakers
used the hand-held microphone. Primarily the experiments were
concerned with proper loudness for talking over aircraft radios. In
each experiment one type of radio was used, signals being transmitted
by use of a dummy antenna to a receiver in the same room. Table III
shows the effect of using ear plugs.

TABLE III.—INTELLIGIBILITY OF RADIOTELEPHONE SIGNALS AS AFFECTED BY NDRC EAR
WARDENS IN LISTENERS’ EARS.

                                         Signal Voltage         Signal Voltage
                                         (Average) 2.5          (Average) 4.5
Condition                        N       Mean Intelligibility   Mean Intelligibility
Radio A, without ear plugs       23      37.8                   44.3
Radio A, with ear plugs          23      44.0                   53.4
Radio B, without ear plugs       24      55.1                   65.8
Radio B, with ear plugs          24      59.4                   72.3

“t’s” from distributions of differences indicate that the difference between intelligibility with
and without ear plugs in each radio-loudness condition is significant at the 1% level of
confidence. Earphones were low-impedance type.

These experiments are typical of ones where Ear Wardens were used.
There is no doubt that the ear plugs improve intelligibility of speech
in airplane-type noise of 108-110 db. The presumption is therefore
strong that the lowered intensity levels presented to the ear when
plugs are worn are more intelligible than the same mixtures of
speech-and-noise presented at higher intensity levels. Ear plugs may,
however, attenuate high frequencies more than low frequencies. To the
extent that this is true, it is possible that a change in frequency
spectrum, rather than in intensity level, is responsible for the
increased intelligibility. To cover this possibility, two experiments
were designed to show how reduction in volume level, without
substantial change in frequency spectrum or signal-to-noise ratio,
would affect intelligibility. The experiments are presented in some
detail here.


To determine whether listeners would be aided by loud signals when
their loudness did not affect the relative strength of signal and
noise or the frequency spectrum of the presented stimuli, phonograph
records were prepared, each containing a mixture of speech and noise.
These records were played through headsets to listening panels at
three levels of signal strength, the strongest level being heard both
with and without NDRC Ear Wardens.

By presenting the loudness-conditions and the individual records in
counterbalanced order to four groups of listeners, the relative
intelligibilities of the following four loudness conditions were
determined: (a) speech output alone 4 volts minus 30 db; (b) speech
output alone 4 volts minus 20 db; (c) speech output alone 4 volts;
(d) same as c but with listeners wearing Ear Wardens. These four
conditions were heard in relation to a given order and set of
word-lists by 18-24 subjects, then in a different order by 18-24 more,
until four groups, totaling 215 listeners, had responded. There were
10 experimental sessions, eight required by the design and two added
to balance the size of the groups.
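
As a rough guide to the size of these steps, the sketch below converts "4 volts minus 20 db" and "4 volts minus 30 db" back into volts, and expresses the 2/1 and 3/2 speech-to-noise voltage ratios reported for the two experiments in db. It assumes the db figures are voltage ratios (20 log10); the function names are illustrative and do not come from the source.

```python
import math


def attenuate(volts, db_down):
    """Voltage remaining after reducing a level by db_down decibels."""
    return volts * 10.0 ** (-db_down / 20.0)


def voltage_ratio_to_db(ratio):
    """Express a voltage ratio in decibels."""
    return 20.0 * math.log10(ratio)


REFERENCE_VOLTS = 4.0  # speech output alone in condition (c)

for db_down in (0, 20, 30):
    volts = attenuate(REFERENCE_VOLTS, db_down)
    print(f"4 volts minus {db_down:2d} db = {volts:.3f} volts")

for label, ratio in (("2/1", 2.0 / 1.0), ("3/2", 3.0 / 2.0)):
    print(f"S/N voltage ratio {label} = {voltage_ratio_to_db(ratio):.1f} db")
```

On this reading the weakest condition delivers roughly an eighth of a volt of speech to the headsets, while the speech-to-noise ratio stays fixed at about 6 db or 3.5 db throughout.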

The experiment was repeated as de- 
scribed above, using four groups totaling 
101 listeners, with the same set of word 
lists recorded against a relatively more 
intense noise. In this experiment, the 
voltage ratio of recorded speech-to-noise 
was 3/2.1 


1 Details of Method. The records used were prepared by “mixing” each
of four recordings of word lists, made by a trained speaker who
monitored his speech for loudness, with a record made by exposing a
hand-held microphone to airplane-type noise in the voice communication
classroom. In preparing the records, speech record and noise record
were played simultaneously, and the output from each was fed to one of
the input channels of a Presto type 87A recording amplifier. Output to
the cutting head was monitored with a vacuum tube voltmeter. (The
meter used was a Hewlett-Packard Vacuum Tube Voltmeter, Model 400A.
The signal level was calculated from the usual voltage formula.) To
prepare records with voltage ratio of 2/1, the noise record was played
alone and the amplifier gain adjusted to give a 2-volt output. The
record of a word list was then played alone, and the gain on its input
channel was adjusted to give meter swings of 4 volts. A record was
then cut with both speech and noise being recorded simultaneously. One
24-word list was cut at 78 rpm. on each side of a 12" glass base
acetate disc. Words were presented in the carrier sentence, Number
____ is ____.

Records were played to experimental subjects through a crystal
phonograph pickup fed through an equalizer to the phonograph input of
a 50-watt amplifier. Twenty-four low-impedance headsets were supplied
across the 85-ohm output tap of the amplifier. Output levels were
monitored by an especially calibrated VU meter, transformer coupled to
the headset circuit. This meter was calibrated in the circuit against
a vacuum-tube voltmeter.

Each record, consisting of 24 words, was played at a given loudness to
approximately 1/4 of the subjects in each experiment. Conditions 1, 2,
3, and 4 were rotated in order. Each condition was presented with each
word list. Conditions were played one after the other with pauses only
long enough to change records.

The experiment was introduced with the following statement: “You are
going to hear some phonograph records played at different loudness
levels. Your job is to write down all the words you can understand or
make good guesses at. If some of the conditions are too loud for you,
you may take off your headset, but be sure to mark your paper to show
this if you do.” (Eight respondents removed headsets during the
experiments. Their results were not included in the computations.)

Before the ear-plug condition, plugs were distributed to the subjects,
who were instructed on fitting them into their ears. Subjects were
told the plugs might make their ears more comfortable in the following
condition. Plug sizes were matched to ear size as well as practicable;
however, with only 40 pairs of plugs available, and with 18-24 men in
each group, fits were not always as good as was desired.

TABLE IV.—INTELLIGIBILITY OF SPEECH IN NOISE WITH SIGNAL STRENGTH VARIED AND S/N
RATIO CONSTANT.

a. Mean Intelligibilities
Signal (Speech)
Level Relative              S/N Ratio 2/1            S/N Ratio 3/2
to 4 Volts                  N       Mean Score       N       Mean Score
0 VU                        215     48.2             101     20.7
0 VU (Ears Plugged)         215     50.2             101     26.2
-20 VU                      215     57.0             101     36.4
-30 VU                      215     61.2             101     40.5
Ave. S.D. within groups, 2/1, 18.0; 3/2, 10.6.

b. Statistical Analysis2
                                          2/1       3/2
(-30 VU) - (-20 VU)                       4.2       4.1
(-20 VU) - (0 VU, Ears Plugged)           6.8       10.2
(0 VU, Ears Plugged) - (0 VU)             2.0       5.5
(-20 VU) - (0 VU)                         8.8       15.7
(-30 VU) - (0 VU)                         13.0      19.8
“t” (5% level)                            4.5       11.2
“t” (1% level)                            6.0       16.1

2 Significance of differences is determined by the “t”-test based on
methods x classes variance. The estimates of error are doubtless
inflated, since no account is taken of the correlation between an
individual’s score in one condition and another. This analysis is
adequate to show the situation regarding the main points at issue, and
since a finer estimate of error involved considerably greater labor,
it was not attempted. Total N for 2/1 S/N Ratio was 215. Total N for
3/2 S/N Ratio was 101. 2/1 S/N Ratio experiment involved 10 class
groups, with roughly 1/4 the N hearing each condition in a given order
and from one of the four records involved. 3/2 S/N Ratio experiment
involved 4 class groups, with roughly 1/4 hearing each of the 4
conditions in a given order and from one of the four records involved.

Table IV shows that the lower the loudness level used, the higher the
average intelligibility. While ear plugs gave an observed gain in
intelligibility, it was not as great as was expected from previous
experiments, perhaps due to some imperfectly fitted plugs.

These experiments show that loud signals in the earphone circuit
decrease intelligibility when they do not result in a more favorable
signal-to-noise ratio. The records and phonograph pickup responses
were the same in loud and soft conditions. Engineers reported that no
substantial change in frequency response of the amplifier was present
with the changes in loudness conditions. Changes in headphone response
over the loudness range used are believed to be negligible.
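
The differences in part b of Table IV follow directly from the means in part a. The sketch below recomputes two of them and compares each with the two "t" rows, read here as the smallest differences significant at the five and one per cent levels. That reading is an assumption based on the table footnote, and the names `MEANS`, `REQUIRED`, and `judge` are illustrative.

```python
# Mean intelligibility scores from part a of Table IV, keyed by S/N ratio.
MEANS = {
    "2/1": {"0 VU": 48.2, "0 VU plugged": 50.2, "-20 VU": 57.0, "-30 VU": 61.2},
    "3/2": {"0 VU": 20.7, "0 VU plugged": 26.2, "-20 VU": 36.4, "-30 VU": 40.5},
}

# Differences read as required for significance: (5% level, 1% level).
REQUIRED = {"2/1": (4.5, 6.0), "3/2": (11.2, 16.1)}


def difference(ratio, softer, louder):
    """Gain in mean score for the softer condition over the louder one."""
    return MEANS[ratio][softer] - MEANS[ratio][louder]


def judge(ratio, softer, louder):
    """Classify a difference against the required differences for that ratio."""
    five_per_cent, one_per_cent = REQUIRED[ratio]
    gain = difference(ratio, softer, louder)
    if gain >= one_per_cent:
        return "1% level"
    if gain >= five_per_cent:
        return "5% level"
    return "not significant"


for ratio in MEANS:
    for softer, louder in (("-30 VU", "0 VU"), ("0 VU plugged", "0 VU")):
        gain = difference(ratio, softer, louder)
        verdict = judge(ratio, softer, louder)
        print(f"S/N {ratio}: ({softer}) - ({louder}) = {gain:.1f} ({verdict})")
```

Under this reading the 30-db reduction is significant at the one per cent level in both experiments, while the ear-plug gain alone does not reach the five per cent level in either.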

In summary, both experiments with 
noise-shielded microphones and with 
earplugs indicate that for the signal-to- 
noise ratios used, relatively weak signals 
are more intelligible than very strong 
ones. Proof that these effects are not due 
to peculiarities of equipment is offered 
in an experiment where frequency 
response and signal-to-noise ratio were 
held constant, but intensity of the stimu- 
lation was varied. In this experiment, 
weak signals were more intelligible than 
strong ones. 

The applicability of the instruction 
to speak loudly is, at least to some de- 
gree, dependent upon the type of equip- 
ment used. In AAF, most speakers use 
microphones which do not produce 
highly favorable signal-to-noise ratios
unless voice levels are high. 

This study indicates the value of 
eliminating noise from the environment. 
It leaves unexplored the possible effect 
of speaking loudly on articulation. Im- 
provement in either of these factors 
would be expected to increase intelli- 
gibility, quite apart from the level 
employed. 

So far as course content in voice com- 
munication courses is concerned, this 
study indicates that recommendations 
concerning desirable loudness level
should be made in the light of the in- 
tensity of noise present at speaker sta- 
tions and the noise pick-up of the micro- 
phone. Since the factors determining 
desirable loudness level are complex, 
experimentation with any set of equipment,
population, and noise-situation
variables is needed before recommenda- 
tions on loudness can safely be made. 
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of air-crew personnel 
indicated that pitch of voice was 
important in determining intelligibility, 
but differed concerning whether a high- 
or low-pitched voice produced the 
greater understandability. The high 
voice was more frequently believed to 
be superior in getting messages across 
under difficult conditions. The problem 
of voice pitch, augmented by these 
opinions, posed three questions for 
study. Does pitch affect intelligibility? 
Can any change from uninstructed pitch 
habits be readily taught? Are other voice 
adjustments to communication-in-noise 
accompanied by a typical pitch adjust- 
ment? 


Two methods were used for deter- 
mining pitch level, judgmental and 
physical. For the first, a judge estimated 
the pitch range represented among re- 
cordings of the subjects’ voices. Five 
pitch levels, representing this range, 
were recorded and compared with simi- 
lar portions of recorded intelligibility 
tests both for the same and different 
speakers. For a physical determination 
of pitch, portions of the speakers’ voices 
were transcribed on transparent discs 
and the sound wave impressions magni- 
fied and counted for a measured time 
interval.1


1 Comparisons were made in 52 instances be- 
tween judgments of pitch by one judge and 
measurements of frequency from transparent 
recordings. The correlation between measure- 
ments and judgments shows the degree to which 
they agree in ranking speakers’ pitches—r (judg- 
ments vs. measurements) = .81 (r’s above .35 
are significant at the one per cent level of con- 
fidence). S.D. judgments = .98 (5 pitch classes 
used); S.D. pitches measured = 43.8 (in c.p.s.
on voiced portions of the word number). Selec- 
tion of uninstructed pitches for correlation 
guards against raising the correlation by ex- 
tending the range of measurement beyond that 
usually encountered. 


INTELLIGIBILITY RELATED TO PITCH 


I. P. BRACKETT 
Northwestern University 


I. EFFECT OF PITCH AND RELATED INSTRUCTION
ON INTELLIGIBILITY


Forty subjects using the hand-held 
microphone read intelligibility tests at 
each of four pitch levels, uninstructed, 
higher, very high, and low. All speaking 
was recorded.2 The subjects controlled
loudness as well as pitch. A VU meter
was connected across the amplifier output
and calibrated for 0 VU to equal
4.2 volts.3

Each speaker read a trial intelligibility 
test list (a second list, if necessary) to 
learn to speak at a loudness that would 
produce a peak meter deflection of 0
VU. He monitored the meter both dur- 


Since the five categories used in judgment of 
pitch represented a limitation of the method 
rather than an arbitrary compression of the 
scores into a smaller number of class intervals, 
Sheppard’s correction for small number of cate- 
gories was not applied. 

This degree of correspondence between judg- 
ments and physical measurement is high enough 
to indicate that, apart from errors in measure- 
ment, the techniques were measuring the same 
variable. 

Details of this method of determining fre- 
quency are presented following this article. 


2 In this and other experiments where high-fidelity
recording was used, the following equipment
was employed: (1) recorders and amplifiers:
Presto recording turntables MRC 16, with
Presto 1-C cutting head; Presto recording ampli- 
fier 87A; (2) pickup: Brush PL-20; (3) equal- 
izer: network constructed to apply inverse feed- 
back at various ranges to make output through 
Presto 87A amplifier flat within 3 db between 
80 and 8,000 c.p.s.; (4) microphones: standard 
AAF microphones and amplifiers, used with 
voltage dividers in output, for recording inter- 
phone network conversation; Shure unidyne 
dynamic microphone, model 55C for studio re- 
cording. 

The spoken word lists were recorded on discs 
in a manner to obtain highest fidelity (outer 
3” of 16” discs at 33 1/3 r.p.m.) and order of
pitch conditions on the discs was rotated to 
counterbalance mechanical effects. 

3 Zero VU, as used in this paper, represents
the deflection of a Weston Model 802 VU meter
connected across the headset circuit. Deflections
read were highest swings observed during
production of a word.


ing trial and final recording. All speak- 
ing was in noise. A second phase of 
each subject’s training—subsequent to 
the recording of uninstructed pitch— 
was to listen to the recording of his 
speaking. He was then requested to read 
at a noticeably higher or lower pitch 
than he heard from the record. In- 
structors helped him informally and 
guarded against extreme pitch changes. 
In all, each subject received from one- 
half to one and one-half hours of indi- 
vidual and informal instruction and 
practice in control of pitch under the 
supervision of professional speech teach- 
ers. 


Instruction began after the recording 
of uninstructed pitch and was progres- 
sively greater as the conditions became 
more difficult. Similarly, practice in 
loudness control was necessarily greater 
with difficult pitch conditions. 


The recordings were played back in 
noise to panels of 10-12 listeners with 
one panel hearing all the recordings 
from a single speaker. The order of 
presentation was counterbalanced. The 
playback level was set for the listener to 
hear the lists with 4.2 volts (peak) at 
the headset. 

No improvement in intelligibility
resulted from the speakers' changing
pitch from their uninstructed one,
this being the one they automatically 
used when speaking in noise and at the 
indicated loudness. Table I shows the
average intelligibility and indicates the
spread of the 40 subjects for each of the
four pitch conditions.


TABLE I. PITCH LEVEL AS RELATED TO INTELLIGIBILITY SCORE. N = 40.

Pitch Level     Mean Score    S.D.
Uninstructed    53.6          3.5
Low             51.4          3.0
High            48.6⁴
Very High       43.9⁵         6.1


4 Significantly lower (2% level of confidence)
than uninstructed mean.

5 Significantly lower (1% level of confidence)
than uninstructed mean.


II. EFFECT OF INSTRUCTION ON PITCH


The preceding section showed that no 
average improvement in intelligibility
resulted from instruction to alter pitch 
systematically. A further investigation 
was made to find to what extent the in- 
struction affected pitch. Recordings of 
the four pitch conditions of 14 of the 
40 subjects discussed in the earlier sec- 
tion were re-recorded in part. Each item 
in the word lists was preceded by the 
introductory phrase, “Number ___.”
Observation showed that the pitch of 
this phrase was well maintained through- 
out the sentence and the entire list. On 
the assumption that from three to five 
of these were representative of the “pitch 
condition” for a speaker, the word num- 
ber was selected for measurement. This 
word, as spoken three to five times by 
each of 14 subjects in each of four pitch 
conditions, was re-recorded on a trans- 
parent disc for pitch measurement. A 
500-cycle tone was recorded in adjacent 
grooves for a timing device. 

The analysis substantiated the obser- 
vation that change of voice frequency 
accompanied the decreased intelligibility 
scores of Table I. Second, relative to 
training in pitch, obviously brief indi- 
vidual instruction did produce pitch dif- 
ferences even when the speaker was 
monitoring himself over headphones and 
was surrounded by high-level noise. 
Other observations were that instruction 
to raise pitch increased pitch variability 
among a group, probably with some 
speakers going to another pitch plateau. 

Table II shows the average frequency
of the 14 subjects for each of the pitch
levels. The speakers were generally
successful in attaining different pitches;
in other words, the training was successful.
However, the changes from uninstructed
to higher pitches were of greater
extent than the shifts to a lower pitch.
Irrespective of the high significance of
the differences among levels, four subjects
were unable to raise their pitches
above high, and two were unsuccessful
in speaking at a lower than uninstructed
pitch.


TABLE II. AVERAGE FREQUENCY OF SPEAKERS AT DIFFERENT PITCH LEVELS. N = 14.

a. Frequency Measurements

                Mean c.p.s. for word
Pitch           number in carrier phrase    S.D.    Range
Low             120.8                       4.7     99-143
Uninstructed    137.8                       6.9     105-194
High            186.5                       10.2    135-235
Very High       215.1                       14.6    116-289

b. Statistical Analysis

                                                    N cases exhibiting change
Comparison                   Difference    “t”      opposite to mean difference
High - Uninstructed Pitch    49.7          6.40     0
Very High - High Pitch       28.6          2.95     4

“t” from distributions of differences: (1%) = 3.01; (2%) = 2.65.


III. FREQUENCY AS RELATED TO
INTELLIGIBILITY

Although it was apparent from the 
foregoing results that instruction to 
change pitch results in lowered average
performance, it was still possible that 
some pitches were more intelligible than 
others. To give information on this
point, all pitches measured for the pre- 
ceding section, uninstructed, low, high, 
and very high, were placed in a frequen- 
cy distribution. The mean frequency 
was 169 c.p.s., S.D. 47.8. Speech samples 
were then classified by the number of 
S.D.’s above or below the mean pitch
used, and analysis of variance applied to 
the intelligibility scores representing the 
recorded items of each pitch class. Table 
III represents the analysis. F for this 
analysis was 1.09, non-significant. Even 
when the three highest pitches were 
combined with the 14 next highest as a 
single class, F was only 1.50, also non- 
significant. 
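
The classing step described in this section can be illustrated with a minimal Python sketch. It uses the pooled mean and S.D. reported in the text; the cut points midway between the printed class limits, the class names, and the function names are assumptions made for illustration, since the source does not say how borderline values were handled.

```python
MEAN_CPS = 169.0  # pooled mean frequency reported in the text
SD_CPS = 47.8     # pooled S.D. reported in the text


def sd_units(freq_cps: float) -> float:
    """Express a measured frequency as S.D. units above or below the mean."""
    return (freq_cps - MEAN_CPS) / SD_CPS


def pitch_class(freq_cps: float) -> str:
    """Assign a sample to one of four pitch classes (cut points are assumed)."""
    z = sd_units(freq_cps)
    if z < -0.55:
        return "low"
    if z < 0.55:
        return "middle"
    if z < 1.55:
        return "high"
    return "highest"


# Frequencies taken from the means and range limits reported in Table II.
for cps in (120.8, 169.0, 215.1, 289.0):
    print(cps, round(sd_units(cps), 2), pitch_class(cps))
```

Once each sample carries a class label, the intelligibility scores can be grouped by class and compared, which is the comparison the analysis of variance summarizes.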


TABLE III. INTELLIGIBILITY SCORE RELATED TO
PITCH (Low to High).

Pitch Class             Mean Intelligibility    N
-1.5 to -.6 S.D.        51.0                    24
-.5 to .5 S.D.          52.5                    15
.6 to 1.5 S.D.          46.6                    14
1.65 to 2.5 S.D.        45.5                    3


Among the 14 speech samples given 
at uninstructed pitch, there was no sub- 
stantial correlation between pitch and 
intelligibility, the rank difference cor- 
relation between pitch and intelligibility 
being .04. 


IV. EFFECT OF NOISE ON PITCH

Twelve subjects who had received four 
hours training in use of interphone 
equipment were drilled in maintaining 
loudness. The aim was for them to 
maintain constant peak loudness (4.2 
volts at the headset) while reading in- 
telligibility tests in silence and in noise. 
They spoke with the hand-held micro- 
phone. Each speaker observed a VU 
meter placed across the headset circuit; 
in addition, he listened to the side-tone 
in his headset. Care was taken that 
pitch of voice was not mentioned before 
or during the drill-recording sessions. 
Phonograph recordings were made of 
two standardized 24-word intelligibility 
tests from each speaker. One list was 
recorded from a quiet room, another
from a room filled with 108-110 db air- 
craft-type noise. As described earlier, 
samples of the word number as used in 
intelligibility testing were used to de- 
termine the frequency of voice. 

Table IV compares the average fre- 
quency used in quiet with that used in 
noise. 


TABLE IV. FREQUENCY OF VOICE USED IN
QUIET AND IN NOISE. N = 12.

                      Average Frequency
Room Condition        (c.p.s.)             S.D.
Normal Quiet          154.9                19.3
Noise, 108-110 db     171.1                14.9

Difference in means = 16.2
“t” from differences between individuals’
scores in the two situations, 3.39, significant at
the 1% level. In only one case was pitch used
in noise lower than that used in quiet.


The subjects in this experiment ap- 
proximated the same VU-meter reading 
as nearly as possible in the two situations. 
However, two factors might operate to 
produce error. Since the noise pick-up 
of the hand-held microphone is included 
in the meter reading in noise, it is prob- 
able that the voice used in obtaining the 
meter reading was less loud in the noisy 
situation. On the other hand, produc- 
ing a specified meter reading is usually 
an approximation and subject to error 
in observation. The difference in mean 
pitch shown between quiet and noise 
in Table IV indicates that speakers tend 
to raise the pitch of the voice when 
speaking loudly in a noisy situation over 
the pitch used with a comparable loud- 
ness in a quiet room. 
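
The “t” quoted for the quiet-versus-noise comparison is based on each speaker's own difference between the two situations. A minimal Python sketch of that kind of paired comparison follows. The frequencies are hypothetical, not the study's data, and the use of the sample standard deviation (n - 1 in the denominator) is an assumption, since the source does not state which form was used.

```python
import math
from statistics import mean, stdev


def paired_t(first, second):
    """Return t for the mean of per-speaker differences (second - first)."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length lists with at least two speakers")
    diffs = [b - a for a, b in zip(first, second)]
    standard_error = stdev(diffs) / math.sqrt(len(diffs))
    return mean(diffs) / standard_error


# Hypothetical mean frequencies (c.p.s.) for four speakers, not from the study.
quiet = [150.0, 160.0, 145.0, 170.0]
noise = [165.0, 170.0, 150.0, 190.0]

print(round(paired_t(quiet, noise), 2))
```

Because every speaker serves as his own control, differences among speakers in habitual pitch do not enter the error term.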


V. EFFECT OF LOUDNESS ON PITCH
OF VOICE


Twenty-seven subjects were selected 
because their normal speaking voices 
were judged to have average pitch. They 
recorded lists of words from intelligi- 
bility tests while maintaining a specified 
level of loudness. The loudness was
checked by having each reader observe 
the deflection of a VU meter connected 
across the output of an interphone amplifier.
A reading of 0 VU was equated
to 4.2 volts in the output circuit. Each 
subject read a list of 24 words at each
of four loudness levels: -6, -3, 0, and 3 
VU. These loudnesses could be characterized
as: soft-conversational, conversational,
appropriate-for-interphone-in-large-aircraft,
and shouting. All lists
were read in a quiet room. 
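
For readers who want the four meter readings in volts, the following Python sketch converts them, assuming that the VU readings behave as decibels relative to the 4.2-volt reference, so that the voltage ratio is 10 raised to (VU/20). That decibel relation is a general property of the VU scale and is not stated in the source; the function name is hypothetical.

```python
REF_VOLTS = 4.2  # voltage equated to 0 VU in the text


def vu_to_volts(vu: float, ref_volts: float = REF_VOLTS) -> float:
    """Convert a VU reading to volts, treating VU as decibels re ref_volts."""
    return ref_volts * 10 ** (vu / 20)


# The four loudness levels used in this experiment.
for vu in (-6, -3, 0, 3):
    print(f"{vu:+d} VU -> {vu_to_volts(vu):.2f} volts")
```

On this assumption the softest level corresponds to about half the reference voltage and the loudest to about 1.4 times it.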

Recordings were later judged for aver- 
age pitch by comparing each speaker's 
reading of the word number with a 
standard record that contained five ref- 
erence pitches. These levels were desig- 
nated 1, 2, 3, 4, and 5, number 1 being 
the lowest. Recordings of 11 of the sub- 
jects were also analyzed physically. 

Analysis of the ratings shows that 
every speaker was judged to have raised 
the pitch of his voice in progressing from 
-6 VU to 3 VU. There was no case 
where a speaker was judged to lower 
his voice while increasing in loudness. 
Statistics relating to this comparison are 
shown in Table V. Twenty-six per cent 
of the speakers increased pitch one class; 
59 per cent increased two classes, 15 per 
cent increased pitch three classes. 

TABLE V. SCATTERGRAM OF JUDGED PITCH OF
VOICE AT -6 VU vs. JUDGED PITCH AT
3 VU. N = 27.


5 
Judged 4 1 
Pitch 3 5 3 
at—6 VU 2 3g 
(1 = low) 1 21 


Judged pitch at 3 VU (1 = low) 


Average pitch change from -6 VU to 3 VU
was 1.9 classes. The S.D. of a distribution of
differences was .12 classes, making “t” 15.8.

Results of the physical analysis shown 
in Table VI are in line with the judg- 
ments of pitch shown in Table V, but 
the measurements show finer discrimina- 
tion, making frequency increases evident 
at each increase in loudness. The mean
differences in frequency were reliable at 
each step of increase in loudness. 


Since a significant difference in pitch 
was produced for each step in loudness, 
using only 11 cases, it is obvious that 
the relationship between pitch and loud- 
ness is quite a definite one. In only two 
cases did an individual increase a “step” 
in loudness without also increasing fre- 
quency. There were no cases where a 
person increased two “steps” in loud- 
ness without increasing frequency. 


TABLE VI. AVERAGE FREQUENCY OF VOICE
USED IN THE WORD number (see text) AT
FOUR LEVELS OF LOUDNESS. N = 11.

a. Loudness and Frequency Measurements

Loudness level
relative to 4.2 volts    Mean Frequency
across earphone          (c.p.s.)          S.D.
-6 VU                    119.4             22.5
-3 VU                    136.0             35.1
0 VU                     154.8             33.1
3 VU                     201.8             50.0

b. Statistical Analysis

Loudness Levels
Compared                 Mean Difference   “t”
-6 VU vs. -3 VU          16.6              3.14
-3 VU vs. 0 VU           18.8              4.72
0 VU vs. 3 VU            47.0              5.47

“t” (1%) = 3.17; “t” (2%) = 2.76


The findings of these experiments may 
be summarized as follows. (1) Instruc- 
tion to raise the pitch of voice used in 
speaking over interphone or radio above 
that ordinarily used is detrimental to 
intelligibility when using the hand-held 
microphone at ground level. (2) An in- 
crease in loudness is accompanied by a 
rise in pitch. (3) Voice in noise is 
characterized by a higher pitch than 
voice in quiet. (4) No correlation was 
found between measured pitch of voice 
and intelligibility, either among persons 
selecting their own pitch, or among persons
instructed to use a pitch higher or
lower than normal. 

These findings indicate that both the 
presence of noise and loud speech tend 
to raise the pitch of voice used, and
that training to assume a pitch other 
than that which results from instruc- 
tion in adequate loudness is not war- 
ranted. 

With regard to theoretical considera- 
tions, the lack of correlation between 
actual pitch of voice and intelligibility 
tends to support a hypothesis that pro- 
duction of the vowel-like sounds, which 
are the principal determiners of “pitch 
of voice” is not an important factor in 
making one man’s speech more under- 
standable than another's. Other evi- 
dence supporting this point of view is 
the relatively high and homogeneous 
understandability or “preservation-in- 
error” values found for vowel sounds in 
misunderstood words and in the fact 
that peak-clipped speech, where the 
loudness of consonant sound is enhanced
in relation to that of the vowel sounds,
is more understandable than normally
transmitted and received speech. 

These lines of evidence support a hy- 
pothesis that speech which is highly 
intelligible in noise differs from less in- 
telligible speech chiefly in: (1) the 
loudness of the speech signal in relation 
to the noise, and (2) the degree to 
which consonant sounds are made recog- 
nizable. If this is true, changing the 
fundamental pitch of voice, or “pitch- 
ing the voice up” can be expected to 
have little or no beneficial effect upon 
intelligibility, since the most noticeable 
change effected by this instruction involves
the vowel rather than the consonant
sounds.


6 The four most intelligible samples analyzed
in the study on training to use a prescribed
pitch of voice ranged from 113 to 147 c.p.s.;
the four samples lowest in intelligibility ranged
from 121 to 187 c.p.s.


A METHOD FOR DETERMINING PITCH
OF VOICE (MASON)


Devices, for example the phonopheneloscope,
have been used for measuring
fundamental frequency of voice.
They present essentially an enlarged pic- 
ture of the grooves in a phonograph 
record. The speed of revolution at which 
the phonograph record was made is 
known; therefore, the frequency of voice 
used can be determined by counting 
the number of wave patterns occurring 
within a given angle of rotation of the 
original record. 

In order to provide analyses of pitch
of voice similar to those made with the
phonopheneloscope, a method employing
the Constant Groove Speed Reference
Recorder was developed. This
method gives results substantially similar 
to those furnished by photographic or 
transcribing methods for any speech 
passage that can be recorded in one 
revolution or less of the CGS turntable. 

The sample of speech to be studied is 
re-recorded on a transparent CGS record 
and speech waves are counted by pro- 
jecting the record grooves on a screen 
at 20- to 35-diameter enlargement by
means of a slide projector. In the
grooves, just inside and outside the sam- 
ple of speech to be analyzed, a 500-cycle 
tone is cut, which serves as a time line. 
This time groove is cut at an angular 
speed which differs by a negligible amount
from that of the speech groove being 
analyzed. 

Figure 1 shows a block diagram of
the equipment, which is operated as
follows. A phonograph record, containing
the groove segment to be analyzed,
is placed upon the phonograph turntable
A, and played with a Brush PL-20
crystal pickup. The operator, wearing 
Brush type A-1 crystal headphones, lis- 
tens to the record, noting the speeches 
just before and after the test segment. 
With switch D connecting the pickup to 
the recording circuit of CGS recorder, 
the cutting level of this machine is adjusted
to give a cut of as high amplitude
as can be achieved without introducing
pronounced distortion. The CGS
recorder is not operating during this 
test, which serves the purposes of spot- 
ting the material to be analyzed and ad- 
justing the amplifier gain on the re- 
cording machine. The pickup is re- 
moved from the record to be analyzed. 
Switch D is placed so that the output 
of the oscillator is fed into the CGS 
recorder, and the gain control on the 
oscillator is adjusted to give a reason- 
ably bright glow in the recorder indi- 
cator lamps. 

Preparations are now complete to 
transcribe the speech segment to be 
analyzed to a transparent CGS disc. 
Switch D is placed to feed the oscillator 
into the cutting head of the CGS re- 
corder, which is set for slow speed, and 
this machine is started. The pickup is 
lowered on to the record to be anal- 
yzed, with the operator listening for 
the cue for the portion desired for 
analysis. At this cue, he places switch 
D to feed the phonograph pickup into 
the CGS recorder. When the passage 
to be analyzed has been played, the 
operator again places switch D so that 
the oscillator is fed to the cutting head. 
After an additional revolution of 500- 
cycle tone is recorded, the apparatus is 
shut down. 

A preliminary check on the quality 
of the CGS recording may be made by 
looking through it with a hand glass. 
If the speech grooves appear straight,
with intermittent groups of nodes, and 
if the 500-cycle cut appears to have 
good amplitude, the record is satisfac- 
tory. If the recording head has been 
overdriven or if some elements of the 
CGS amplifier have been microphonic 
during recording, the speech waves ap- 
pear rough, nodes not being noticeably 
grouped. 

FIGURE 1. EQUIPMENT FOR PRODUCING TRANSPARENT RECORD OF SPEECH SEGMENT
WITH TIME BASE
[Block diagram. Labels: (A) turntable with Brush PL-20 pickup; Brush type A-1
crystal headphones; (C) audio oscillator; switch; monitoring phones; C.G.S.
recorder, Model P2, recording at slow groove speed.]


Assuming that the record is satisfactory,
small identifying numbers, spaced
every 10-15 degrees, are scribed around
the inside of the record, near the label.
The drive-belt of the CGS recorder is
then removed, and the recorder is adjusted
to playback position. By turning
the turntable by hand, the point at
which the section to be analyzed begins
can be found and appropriate notes
made. The record is mounted on a fixture
which places the grooves in position for
projection in a projector for 3¼” by 4”
slides. The record grooves must be
slightly off-center with respect to the
lens system of the projector, to give
oblique lighting across the plane of the
record. The fixture employed allows the
record to be turned on its axis. A magnification
of 20 to 35 diameters is required
for easy and accurate reading.
In reading the record, corresponding 
points in speech waves can be marked 
off on a piece of paper superimposed 
over the screen, distinctive marks being 
made through nodes in the 500-cycle 
line in the adjacent groove. By applying 
the formula a/x = b/500, where a is the
number of speech wave patterns marked,
x is the unknown fundamental frequency,
and b is the number of 500-cycle
nodes adjacent to the speech
sample, the average voice frequency can
be determined. Since the angular speed
of rotation is different for each groove,
this reading will be very slightly in error.
Repeated counts using the time lines
just inside and outside the speech sample
show no difference in results, indicating
that this source of error is negligible.
Since comparative, rather than absolute,
measurements were required in these
studies, the calibration of the oscillator
as it was received from the factory was
acceptable. A slide projector with a well-corrected
lens was used, in order to
minimize eye fatigue in counting waves.
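
The proportion relating counted speech waves to counted nodes of the 500-cycle time line can be solved for the unknown frequency, giving x = 500a/b. A minimal Python sketch of that arithmetic follows. The function and parameter names are hypothetical, and the counts in the example are invented for illustration rather than taken from the study.

```python
def fundamental_frequency(speech_waves: int, timing_nodes: int,
                          timing_hz: float = 500.0) -> float:
    """Solve a/x = b/timing_hz for x, the average voice frequency in c.p.s.

    speech_waves -- a, the number of speech wave patterns marked
    timing_nodes -- b, the number of time-line nodes beside the same span
    """
    if speech_waves <= 0 or timing_nodes <= 0:
        raise ValueError("both counts must be positive")
    return timing_hz * speech_waves / timing_nodes


# Invented example: 30 speech waves lie alongside 100 nodes of the 500-cycle line.
print(fundamental_frequency(30, 100))
```

Because only the ratio of the two counts enters the result, the span marked off need not be of any particular length, provided both counts cover the same span.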


STUDY OF PHONETIC FACTORS IN RELATION TO ACCURACY OF
TRANSMISSION OF WORDS IN AIRPLANE NOISE


WILMER E. STEVENS
University of Wyoming


This report presents the results of:
(1) a statistical study of phonetic
characteristics as related to intelligibility
in a list of 898 common one- and two-syllable
words used in testing United
States Army Air Forces personnel; and
(2) a follow-up investigation of the
relative capacities of speech sounds to
preserve identity in words incorrectly
transmitted. Both researches were directed
toward the accumulation of answers
to a specific practical question:
How may telephonic communication in
the noise typically present within the
fuselage of a military airplane in flight
be made more effective?

The following summary of the results
is presented prior to the descriptions of
the studies.

(1) Words of two syllables were found to
be more intelligible than words of one syllable.
In groups of words containing the same number
of speech sounds, those of two syllables
were found to be transcribed correctly approximately
forty per cent more often than words
of one syllable.

(2) Some evidence was found of significant relationship
between intelligibility and pattern of
stress in words of two syllables.

(3) One class of speech sounds, the fricatives
in the words sit, fox, and think, was found to
be consistently associated with low intelligibility.

(4) The class of sounds found to be most
significantly associated with high intelligibility
(in two-syllable words only) was that including
the diphthongs in the words tone, cow, boy,
high, and tape.

(5) In two-syllable words also, the sounds indicated
in risk, watt, which, how, and year were
found to be associated with better-than-average
intelligibility.

(6) Words the articulation of which involves
protrusion and recession of the lips were found
to be reliably superior to others in intelligibility.


(7) Vowels, including diphthongs, were found 
to preserve phonetic identity more often than 
consonants, from 51% to 90% of the time,
mean for fifteen sounds, 75.2%. 

(8) For vowels a significant variation in value 
associated with manner of enunciation was 
found to exist: mean values—for diphthongs, 
86%; for front vowels, 76%; for middle vowels, 
71%; for back vowels, 58%. 

(9) The values were found to range from
0.22% to 82% for consonants occurring before
the vowel, from 0.20% to 82% for consonants
occurring after the vowel.


(10) The values for certain consonants were 
found to shift significantly according to position 
relative to the vowel. Values for the nasals, me 
and no, for example, were found to be 11% 
and 0.22%, respectively, before the vowel and 
69% and 54%, respectively, after the vowel. 


This first study involved statistical 
analysis of intelligibility test records 
made by aviation cadets, enlisted men 
awaiting assignment to flying training, 
flying instructors, and others. The tests 
had been administered either in the 
course of research or as part of the 
training program growing out of that 
research. For the understanding of this 
report, attention needs be drawn to the 
following items. (1) The intelligibility 
rating of any given word is the percen- 
tage correct of all transcriptions of it. 
(2) Each of the 898 words in the list 
subjected to analysis had been spoken 
by 30-50 persons, heard and transcribed 
by 300-500 others. The chances for error 
on each word, it is evident, ranged in 
number from 9,000 to 25,000. Hence, 
though such matters as regional varia- 
tion in dialect were not taken into 
account, the rating achieved by any 
given word is probably a reliable index 
of intelligibility characteristics inherent 
in it.1 (3) The correlations shown in
Table I indicate the stability of the
ratings upon which this study was based.


TABLE I. THE INTELLIGIBILITY VALUES OF WORDS WHEN SPOKEN UNDER DIFFERENT CONDITIONS.

Situation                                          N Words    r      Mean x    Mean y

Voice Communication Laboratory Ratings (x)
vs. Training Installation Ratings (y)              460        .92    61.4      63.0

Ratings in Moderate Noise (x) vs. Ratings
in Severe Noise (y)                                429        .82    62.7      40.9

Ratings Given at Psycho-Acoustic Laboratory,
Harvard (x) vs. Ratings Given at Voice
Communication Laboratory (y), source of
values used in this study                          55         .70    68.0      56.7

All S.D.’s of distributions ranged between 17.0 and 21.0.

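
The rating defined in item (1) and the count of chances for error in item (2) can be expressed as a short Python sketch. The function names are hypothetical, and reading the chances for error as the product of speakers and listeners is an interpretation inferred from the figures the text gives (30 and 300 yielding 9,000; 50 and 500 yielding 25,000).

```python
def intelligibility_rating(correct: int, total: int) -> float:
    """Percentage of all transcriptions of a word that were correct."""
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("need 0 <= correct <= total and total > 0")
    return 100.0 * correct / total


def chances_for_error(speakers: int, listeners: int) -> int:
    """Number of speaker-listener pairings for a single word (assumed reading)."""
    return speakers * listeners


print(chances_for_error(30, 300))   # lower bound quoted in the text
print(chances_for_error(50, 500))   # upper bound quoted in the text

# Invented example: 5,400 correct transcriptions out of 9,000.
print(intelligibility_rating(5400, 9000))
```

With thousands of transcriptions behind each word, a single rating is little affected by any one speaker or listener, which is the basis for treating it as a property of the word.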

For the study of preservation-in-error 
tendencies of speech sounds one hundred 
fifty one-syllable words were selected 
from the intelligibility tests. Selection 
was random except that (1) all words 
containing the combination of sounds 
indicated in hart were eliminated be- 
cause of the extreme variation in pro- 
nunciation of that combination, and 
(2) all words containing the diphthongs 
in boy and cow were included in order 
to make the samples of those sounds
more adequate in size. The first operation
performed in this study was an
analysis of the data to evaluate the effect
of syllabication upon intelligibility.
Approximately half of the subjects had
been tested in noise at a level of about
114 decibels, while the other half had
been tested in noise at a level of 108-110
db. Results are set out in Table II.

1 In another study the mean intelligibility
of trainees from the 6th Service Command was
found to be more than 10% above that of
trainees from the 8th Service Command.


TABLE II. INTELLIGIBILITY VALUES OF ONE-SYLLABLE vs. TWO-SYLLABLE WORDS.

Group of Words                                        N      Mean Score    S.D.

One-Syllable Words Spoken in Moderately Difficult
Noise Conditions (108-110 db)                         230    51.6          19.2

Two-Syllable Words Spoken in Moderately Difficult
Noise Conditions (108-110 db)                         239    70.0          13.6

One-Syllable Words Spoken in Very Difficult
Noise Conditions (114 db)                             139    30.0          17.8

Two-Syllable Words Spoken in Very Difficult
Noise Conditions (114 db)                             290    46.0


Obviously, two-syllable words, as a
class, were differentiated from one-syllable
words by greater average length.
The lengths of words were best measured,
for purpose of analysis, by actually
timing speakers’ utterance of them during
administration of tests. The measures
obtained would be influenced by
at least four factors: (1) the complexity
(number of separate speech sounds
contained) of words, (2) the average
relative length of those sounds, (3)
extension of some words in articulation
of syllables, (4) the speech habits
of individual speakers. Factor (1) might
also have an effect upon intelligibility
independent of its relation to the lengths
of words. Since measures of time of utterance
were not available, the effect of
syllabication upon intelligibility could
be checked only by an analysis comparing
factors (1) and (3). For this purpose
the words were cross-classified according
to the number of separate speech sounds
contained. Findings are displayed in
Table III.

Inspection having indicated a possible 
relationship between stress pattern and 
intelligibility, two-syllable words were 
classified upon the basis of their pronun- 
ciation fitting one of the following pat- 
terns: (1) secondary accent-primary ac- 
cent, (2) primary accent-secondary ac- 
cent, (3) no accent-accent, (4) accent- 
no accent. The classification was done 
by two persons, each working independ- 
ently, and their classifications of all 
words used in the analysis were the same. 
Results of the analysis appear in Table 
IV. 


It should be noted that the one-syllable
group contained some highly
intelligible words, as did all the other
groups. Moreover, the groups averaging
highest contained some words of low
intelligibility.


In preparing for analysis to discover 
their influence upon the intelligibility 
of words, 43 sounds characteristic of 
English speech were differentiated. 
Three of these, azure, ring, you, did not 
occur in the sample of words available 
for analysis. Since an attempt to consider 
each sound separately could not have 
given statistically reliable results, due 
to the inadequacy of samples available 
in many cases, the 40 remaining sounds 
were grouped into 18 classes upon a 
basis of phonetic similarity (Table V). 


TABLE III.—INTELLIGIBILITY OF WORDS RELATED TO WORD LENGTH AND SYLLABIFICATION.

                  One-Syllable Words          Two-Syllable Words
Number of
Sounds              N     Mean Score            N     Mean Score
2                   13       51.3
3                   90       53.7
4                  108       50.5               42       69.9
5                   23       49.0              115       69.6
6                                               61       69.3
7 or more                                       21       72.5

Variance within groups = 14.7                 Variance within groups = 12.1
Variance between group means = 9.1            Variance between group means = 2.3

TABLE IV.—INTELLIGIBILITY OF WORDS RELATED TO STRESS PATTERN.

a. Moderately Difficult Noise Conditions

Stress Pattern                        N     Sample Word     Mean     S.D.
Secondary accent-primary accent       42    provide         75.3     14.7
Primary accent-secondary accent       55    elbow           75.2     17.8
No accent-accent                      29    attract         74.5     17.9
Accent-no accent                     113    foolish         65.4²    16.1
One Syllable                         230    quick           51.7     19.2
Total                                469                    60.8     20.4

b. Very Difficult Noise Conditions

Stress Pattern                        N     Sample Word     Mean     S.D.
Secondary accent-primary accent       51    convince        54.2³    17.0
Primary accent-secondary accent       62    football        47.0     16.0
No accent-accent                      32    abound          54.7²    16.1
Accent-no accent                     145    sassy           40.6²    16.7
One Syllable                         139    church          30.0     17.8
Total                                429                    40.9     19.0

² Difference between this and next lower mean significant at the 1% level.
³ Difference between this and next lower mean significant at the 5% level.

TABLE V.—SOUND GROUPS USED IN PHONETIC ANALYSES AND PERCENTAGE OF ONE- AND TWO-
SYLLABLE WORDS CONTAINING ONE OR MORE REPRESENTATIVES OF EACH GROUP.

Group: Words Containing               % of 369 One-      % of 529 Two-
One or More of the Sounds             Syllable Words     Syllable Words
Italicized                            Containing         Containing

sit, fox, or thin                         32.0               40.3
tap, pot, or bat                          44.7               51.9
bat (a sub-group of above)                 8.1               13.4
chat or shot                               9.5                6.8
dot or get                                21.7               27.6
kit                                       19.0                5.2
which, how, watt, or year                 12.9               10.8
jot                                        7.0                4.7
zip, vat, or that                          5.9               15.3
me or no                                  29.3               47.1
roar                                      29.3               29.3
let                                       18.4               32.1
ten or tap                                18.7               35.1
top or talk                               17.1               16.8
teem, tip, took, or tool                   5.4               42.4
tone, cow, boy, high, or tape             27.9               29.6
flatter or err                             4.8               20.0
ton                                        5.7                9.6
alone                                      0.0               30.1

The results of this analysis are tabulated in Tables VIII and IX. 

TABLE VI.—INTELLIGIBILITY RELATED TO s, f, θ.

                                    N       Mean Score    Mean Score     % Containing
Condition                         Words     Words         Words Not      One or More     S.D.⁴
                                            Containing    Containing
One-syllable (Moderate Noise)      230        40.1          57.6            35.2         19.2
One-syllable (Intense Noise)       139        20.1          33.7            26.6         17.8
Two-syllable (Moderate Noise)      239        63.4          73.5            40.6         13.6
Two-syllable (Intense Noise)       290        42.9          48.1            40.0         17.3

All the differences between “mean-containing” and “mean-not-containing” are
significant at the 1% level, except the last, which is significant at the 2% level.

⁴ Computed for words containing and words not containing combined.


Then, keeping one-syllable words sep- 
arate from two-syllable words and words 
spoken in moderate noise separate from 
words spoken in intense noise, mean 
intelligibility scores were computed, (1) 
for all words containing one or more of 
the sounds in a given class and (2) for
all words containing none of the sounds 
in that class. 

One class of sounds only was found to 
be consistently associated with signifi- 
cant differences in mean intelligibility, 
that comprising the three unvoiced 
fricatives, sit, fox, and thin. Results of 
the analysis dealing with this class of 
sounds have been consolidated in Table 
VI. 

Some sounds, like which and prune, 
always require the speaker to protrude 


the lips. Other sounds involve lip pro- 
trusions only when in certain combin- 
ations; the word chin does not involve 
protruding lips; the word church does; 
the word purr does not. Inspection hav- 
ing indicated these protrusions to be 
associated with high intelligibility, an 
analysis of that factor in two-syllable 
words spoken in severe noise was made 
(Table VII). 


TABLE VII.—INTELLIGIBILITY RELATED TO LIP PROTRUSION.

                            N       Mean⁵
Words containing none      171       7.76
Words containing one       114       9.96
Words containing two         6      11.83

Variance within groups = 10.80; between group means = 195.25; F = 18.09.

⁵ Convert means by multiplying by 5 and adding 2.5. Convert variances by
multiplying by 25.
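
The conversion rule in the table footnotes and the F ratio can be checked with a few lines of arithmetic. The following is a minimal sketch in Python; the function names are illustrative and not part of the original study, and the figures are those printed in Table VII.

```python
# Coding scheme described in the table footnotes: class interval of 5 points,
# with the midpoint of the zero interval at 2.5% intelligibility.
CLASS_INTERVAL = 5.0
ZERO_MIDPOINT = 2.5


def decode_mean(coded_mean):
    """Convert a coded work-sheet mean to per cent intelligibility."""
    return coded_mean * CLASS_INTERVAL + ZERO_MIDPOINT


def decode_variance(coded_variance):
    """Convert a coded variance to squared percentage points."""
    return coded_variance * CLASS_INTERVAL ** 2


def f_ratio(variance_between, variance_within):
    """F is the variance between group means over the variance within groups."""
    return variance_between / variance_within


# Coded means for the lip-protrusion groups as printed in Table VII.
table_vii_means = {"none": 7.76, "one": 9.96, "two": 11.83}
decoded = {group: round(decode_mean(m), 2) for group, m in table_vii_means.items()}
print(decoded)
print(round(decode_variance(10.80), 1))
print(round(f_ratio(195.25, 10.80), 2))
```

Under this rule the coded mean 7.76 corresponds to about 41.3% intelligibility, and the ratio 195.25 / 10.80 comes to about 18.08, which agrees with the printed F of 18.09 to within rounding.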

TABLE VIII.—STATISTICS RELATING TO PRESENCE OR ABSENCE OF SPEECH SOUNDS; VERY DIFFICULT
CONDITIONS.⁶

                                                                            Variance
                               N          N Not       Mean        Mean Not  Variance  Between
Class of Sounds             Containing  Containing  Containing  Containing  Within    Group       F
                                                                            Groups    Means

a. One-syllable words
sit, fox, thin                  37         102         3.51        6.24      11.36    200.85    17.69
top, pot, bat                   54          85         5.14        5.74      12.74     11.33     1.18
bat                              9         130         6.78        5.84      11.93     28.67     2.41
chat, shot                       7         132         4.86        5.55      12.80      2.65
dot, get                        28         111         6.25        5.32      12.68     18.81     1.48
kit                             23         116         4.91        5.63      12.75      9.48
which, how, watt, year          27         112         6.28        5.40      12.65     84.20     6.65
jot                              9         130         7.44        5.38      12.56     35.85     2.85
zip, vat, that                  12         127         3.50        5.70      12.43     53.10     4.27
me, no                          32         107         4.34        5.71      12.69     18.20     1.44
roar                            31         108         6.26        5.30      12.66     21.94     1.73
let                             25         114         7.00        5.18      12.33     67.32     5.46
ten, tap                        27                     5.41⁷
top, talk                       22                     5.18
teem, tip, took, tool           34                     48
tone, cow, boy, high, tape      39                     6.51
err                              9                     6.56
ton                              8                     4.71

b. Two-syllable words
sit, fox, thin                 116         174         8.09        9.12      11.89     74.52     6.27
top, pot, bat                  148         142         8.66        8.76      12.14       .56
bat                             37         253         9.65        8.57      12.02     37.11     3.08
chat, shot                      20         270         7.90        8.68      12.09     14.51     1.20
dot, get                        78         212         8.7         8.71      12.19       .49
kit                             69         221         8.64        8.73      12.10      1.30
which, how, watt, year          34         256        10.70        8.57      11.63     82.06     7.06
jot                             17         273         8.88        8.69      12.14       .31
zip, vat, that                  46         244         9.26        8.60      12.09     15.61     1.29
me, no                         148         142         8.12        9.32       8.31    103.86    12.50
roar                            90         200         9.84        8.20      11.56    168.21    14.55
let                             84         206         9.23        8.50      12.04     31.22     2.59
ten, tap                        99         191         8.48        8.83      12.12      7.67
top, talk                       50         240         9.22        8.60      12.09     15.65     1.29
teem, tip, took, tool          127         163         8.51                  12.12      8.69
tone, cow, boy, high, tape      97         193         9.74        8.19      11.64    156.48    13.44
err                             51         239         9.18        8.61      12.10     13.79     1.14
ton                             27         263         7.59        8.82      12.02     31.11     2.59
alone                           82         208         8.07        8.96      11.98     46.04     3.84

⁶ All scores given in this table are coded scores from work sheets. The mid-point of coded
score zero interval is 2.5% intelligibility. Class interval is 5% intelligibility. Convert variances
by multiplying by 25. Convert means by multiplying by 5, then adding 2.5.

⁷ Variance within groups = 12.02; between group means = 31.43; F = 3.84; N = 139;
Group mean (total) = 5.50.


In studying the effect of phonetic
characteristics upon the intelligibility of
words, problems are encountered which
are not present in the study of audibility
of speech sounds as such. When
only audibility is in question, nonsense
syllables can be made up which place
any sound in all its possible syllabic
positions, and its audibility as part of
such nonsense syllables can be determined
for any given noise condition by
articulation tests. Much of this work has
been done by Bell Telephone Labora-
tories.

Sounds in words are different in at 
least three important respects from 
sounds in nonsense syllables: (1) A 
sound in a word shares the burden of 
insuring intelligibility with other sounds, 
and it may serve its purpose either by 

TABLE IX.—STATISTICS RELATING TO PRESENCE OR ABSENCE OF SPEECH SOUNDS; MODERATELY
DIFFICULT NOISE CONDITIONS.⁸

                                                                            Variance
                               N          N Not       Mean        Mean Not  Variance  Between
Class of Sounds             Containing  Containing  Containing  Containing  Within    Group       F
                                                                            Groups    Means

a. One-syllable words
sit, fox, thin                  81         149         7.52       11.03      11.98    648.58    54.1
top, pot, bat                  111         119         9.22       10.34      14.52     70.86     4.88
bat                             21         209        10.90        9.68      14.71     27.28     1.85
chat, shot                      28         202        11.57        9.55      14.39     91.54     6.36
dot, get                        52         178        11.83        9.20      13.62    276.03    20.28
kit                             47         183        10.64        9.58      14.65     40.60     2.77
which, how, watt, year          21         209        10.72        9.78      14.83               1.29
jot                              7         223         8.14        9.85      14.74     20.03     1.36
zip, vat, that                  10         220         8.80        9.84      14.78      9.78
me, no                          76         154         9.22       10.08      14.67     36.69     2.50
roar                            77         153        10.20        9.60      14.75     18.15     1.23
let                             43         187        10.33        9.67      14.76     14.62     1.01
ten, tap                        42                     8.83⁹
top, talk                       40                    10.20
teem, tip, took, tool           60                     9.48
tone, cow, boy, high, tape      64                    10.39
err                             10                    11.67
ton                             14                     9.54

b. Two-syllable words
sit, fox, thin                  97         142        11.97       14.20      13.58    287.61    21.18
top, pot, bat                  127         112        12.94       13.71      14.64     34.80     2.38
bat                             33         206        12.68       13.59      14.85     36.27     2.44
chat, shot                      17         222        13.94       13.43      12.01      5.73
dot, get                        68         171        14.32       13.12      11.72     72.44     6.18
kit                             64         175        13.50       13.45      14.73       .50
which, how, watt, year          23         216        14.50       13.35      11.62     58.96     5.07
jot                              8         231        13.88       13.45      12.02      2.12
zip, vat, that                  35         204        14.14       13.35      11.94     20.24     1.69
me, no                         101         138        12.54       14.14      11.39    150.60    13.22
roar                            65         174        14.57       13.05      11.16    110.78     9.93
let                             83         156        14.08       13.14      11.86     50.66     4.29
ten, tap                        87         152        13.09       13.68      11.94     21.20     1.78
top, talk                       39         200        13.72       13.42      12.01      4.42
teem, tip, took, tool           97         142        13.64       13.34      12.00      6.02
tone, cow, boy, high, tape      60         179        14.73       13.04      11.46    135.10    12.46
err, father                     55         184        12.66       13.71      11.82     49.84     4.22
ton                             24         215        12.88       13.53      11.99      9.99
alone                           77         162        12.73       13.82      11.7      63.28     5.38

⁸ All scores given in this table are coded scores from work sheets. The mid-point of coded
score zero interval is 2.5% intelligibility. Class interval is 5% intelligibility. Convert variances
by multiplying by 25. Convert means by multiplying by 5, then adding 2.5.

⁹ Variance within groups = 14.22; between group means = 26.29; F = 1.85; N = 290;
Group Mean (Total) = 9.83.



being heard or simply by providing a 
period of silence or unidentified sound 
of appropriate length between other 
identified sounds. (2) A sound in a word 
may be limited by the language in the 
combinations in which it is likely to be 
found, so that a survey of its audibility 
in all conceivable positions may give
either too high or too low an estimate 
of the probability of its being recognized 


in a word. (3) Words may vary in the 
completeness with which they need to 
be suggested to the listener, due to 
emotional coloring, familiarity, or ap- 
propriateness to “set”. For example, an 
aircrew member might readily be more 
alert to the word propeller than to the 
word compulsion. 

These differences make a determina- 
tion of the effect of any sound upon 
word intelligibility a separate problem
from its audibility. In the word, effec- 
tiveness in the usual working matrix is 
the question, rather than effectiveness 
in artificially controlled conditions. The 
structure of language limits investiga- 
tion of sounds in words rather seriously. 
In the earlier of these studies the pur- 
pose was to test each of forty speech 
sounds to see if its presence helped or 
hindered intelligibility of words con- 
taining it. With 898 words for which 
stable intelligibility values were at hand, 
it was necessary to group sounds in order 
to present a sufficient number of words 
in each class for statistical analysis. 


In dealing with the degree to which a 
sound itself is properly understood in 
words, it is possible to study all words 
correctly understood plus all errors in 
which the sound under consideration is 
preserved. It is also possible to deal only 
with those instances in which the sound 
is recognized but the word is missed. 
The present study deals with sounds 
correctly understood in misunderstood 
words. This alternative was chosen be- 
cause it appeared more likely to give a 
critical valuation of the understand- 
ability of the sound. Where the whole 
word is understood, each sound shares 
responsibility with many others. Where 
it is one of fewer elements properly 
understood, its understandability is 
more clearly defined. The confining of 
this study to words of one syllable, also,
was for the purpose of limiting the 
number of factors which might influence 
results. Accordingly, it should be noted, 
the results reported are significant as 
evidence of promise in the line of investi- 
gation herein begun. 


In carrying out the study: (1) All
erroneous transcriptions of each stimulus-word
were recorded. (2) A preservation-in-error
value for each vowel
(or diphthong) was computed by

a. grouping together all words containing the
sound

b. counting the total number of erroneous
transcriptions

c. counting the erroneous transcriptions in
which the same vowel sound as that in the
stimulus-word appeared

d. dividing the second figure by the first¹⁰


(3) Preservation-in-error values were 
computed in the same way for each 
consonant, except that for each sound 
found both before and after the vowel 
(anywhere in the sample, not necessarily 
in the same word) two separate values 
were computed. (Tables X, XI)
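
The counting procedure in steps a to d can be sketched as follows. This is a minimal illustration in Python, not the authors' own tabulation method: the function name and the sample error pairs are hypothetical, and each sound is identified by its key word in the manner of the tables.

```python
from collections import defaultdict


def preservation_in_error(errors):
    """Return the preservation-in-error value (per cent) for each sound.

    `errors` is an iterable of (stimulus_sound, response_sound) pairs, one
    pair for every erroneous transcription of a stimulus-word.
    """
    total = defaultdict(int)      # step b: all erroneous transcriptions
    preserved = defaultdict(int)  # step c: errors that kept the same sound
    for stimulus_sound, response_sound in errors:
        total[stimulus_sound] += 1
        if response_sound == stimulus_sound:
            preserved[stimulus_sound] += 1
    # step d: divide the preserved count by the total count
    return {sound: 100.0 * preserved[sound] / total[sound] for sound in total}


# Hypothetical error pairs, keyed by the vowel of the stimulus-word.
sample_errors = [
    ("tip", "tip"), ("tip", "ten"), ("tip", "tip"), ("tip", "tip"),
    ("talk", "top"), ("talk", "talk"),
]
values = preservation_in_error(sample_errors)
print(values)
```

For consonants the same function could be applied twice, once to the errors for sounds spoken before the vowel and once to those spoken after it, which yields the two separate values described in item (3).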


Inasmuch as one of the immediate
purposes of these studies was to discover
a method for “predicting” the intelligibility
of words to be used by flying
personnel in standard messages, an
analysis was made to disclose the relationship
between the preservation-in-error
value of sounds and the intelligibility
of whole words. The value for
each sound was expressed as a single
digit (nearest 10%) and an average
value computed for each of 499 words of
known intelligibility rating. The results
set out in Table XII reveal a definite
positive relation between the intelligibility
of words and the average
preservation-in-error value of the sounds they
contain. The average preservation-in-error
values in the first part of the table
were computed for all sounds contained
in the words, while those in the second
part were computed for consonants only.
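
The averaging step described above can be sketched as follows. This is an illustration only: the function names are hypothetical, the half-up rounding rule is an assumption (the text says only "nearest 10%"), and the three sample values are read from Tables X and XI as printed.

```python
def pie_digit(value_percent):
    """Express a preservation-in-error value as a single digit (nearest 10%).

    Rounding halves upward is an assumption; the source does not state
    how ties were treated.
    """
    return int(value_percent / 10 + 0.5)


def word_average(sound_values):
    """Average the single-digit values of the sounds that make up a word."""
    digits = [pie_digit(v) for v in sound_values]
    return sum(digits) / len(digits)


# Illustrative one-syllable word: initial consonant 37, vowel 78, final consonant 38.
average_all_sounds = word_average([37, 78, 38])
average_consonants = word_average([37, 38])
print(round(average_all_sounds, 2), round(average_consonants, 2))
```

A word's average computed this way would then be placed in one of the class intervals of Table XII (for example 5.01-6.00 for all sounds) and compared with its known intelligibility score.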


¹⁰ Strict phoneticians will note the possibilities
of error in this analysis resulting from (1) uncertainty
as to the exact pronunciation given
stimulus words by speakers and (2) uncertainty
as to the pronunciation listeners would give the
spelling of their responses. To be very exact,
the study should be made with subjects trained
in phonetics, writing their responses in phonetic
symbols. These refinements might give interesting
results.


TABLE X.—PRESERVATION-IN-ERROR VALUES OF VOWEL SOUNDS AND VOWELS MOST OFTEN
SUBSTITUTED FOR EACH.

Vowel        N        P-i-E      N
Spoken       Words    Values     Errors    Substitutions
tape          16        90         920
teem           1        86         486     tip, 5%; tape, 5%
type           a        87         276     tap, 7%
tone          12        85         523
cow            3        84         228     ten, 6%
err            7        79         259     talk, 7%
tip           19        78        1269     ten, 13%
tap            5        77         914     ten, 14%
ton           10        71         647     top, 7%
top           13        67         681     talk, 10%
tool           6        66         322     tone, 12%
ten           14        64        1028     tip, 21%
boy            3        64          72     top, 11%
took           5        60          58     top, 31%
talk           9        AL         538     top, 28%


TABLE XI.—PRESERVATION-IN-ERROR VALUES OF CONSONANTS.¹¹

                 Spoken Before Vowel                      Spoken After Vowel
           N       N Words       N        P-i-E     N       N Words       N        P-i-E
Sound      Words   Substituted   Errors   Value     Words   Substituted   Errors   Value
ten         10       225           677     19        30       740          2238     38
dot         12       365           695     28        13       200           460     48
kit         18       334           974     40        15       211                   18
get          9       168           joR     26         1         5             8     25
pot         17       313          1000     33         9       170           446     47
bat         12       244           570     13
chat         6       Re            278     29         8       147           394     25
jot          2        50           186     3°         2        29           166     38
sit         17       373          1151     37        16       400          1290     17
shot         7       105           276     49         1        60           177     60
vat          2        19           192      2         3        85           185     18
fox          6       136           334      8         3        82            23
thin         3        69           217      1         5       125           499     0.20
no           6       227           j62     0.22      29       685          1895     54
me           4        65           213     i         12       230           635     69
watt        12       172           573     71
which        5        80           246      3
year         1        13            54
let         13       253           662     71         9       152           503     82
roar        31       650          1734     82        15       274           962     59
how          5        99           362     49

¹¹ The blanks in some columns mean that the sound did not appear in the position indicated
in the sample of words analyzed. The sounds azure, that, and ring appeared in no words in
the sample.


To a considerable extent, the results
of these investigations speak for themselves.
However, there appears to be
some point in drawing attention to the
following items.

(1) Study of the preservation-in-error 
capacities of sounds appears to be the 
line of research that promises most for 
the understanding of intelligibility in 
noise. 

(2) If the problem of communication 


in noise is important, the data on pre- 
servation-in-error values of sounds 
should be increased through studies of 
larger and different samples of words. 

(3) The data herein set out furnish 
a useful basis upon which to select a 
vocabulary for use in circumstances 
detrimental to effective oral communica- 
tion because of surrounding noise. 

TABLE XII.—RELATION BETWEEN WORD INTELLIGIBILITY AND PRESERVATION-IN-ERROR VALUES
OF COMPONENT SOUNDS.

a. All Sounds

P-i-E          N        Mean Intelligi-    Weighted Average
Values         Words    bility Score       S.D.                  F
3.01-4.00       33         47.02
4.01-5.00       58         48.70              18.51              6.92
5.01-6.00       95         51.02
6.01-7.00       47         60.26
Total          233

3.01-4.00       28         66.76
4.01-5.00       81         65.79              17.03              4.00
5.01-6.00       99         72.80
6.01-7.00       28         76.07
Total          236

b. Consonants Only

0-2.5           25         47.30
2.6-4.0         73         45.92              18.53              6.75
4.1-5.5         99         52.90
5.6-7.0         36         61.96
Total          233

0-2.5           26         66.76
2.6-4.0         87         63.82              16.68              8.75
4.1-5.5        103         74.58
5.6-7.0         20         79.08
Total          236

All values for F, variance between groups ÷ variance within groups, significant at the 1% level.
IMPROVEMENT OF LISTENER PERFORMANCE IN NOISE


HARRY M. MASON 
Purdue University 


The interphone system used in Army
Air Forces voice-communication
training involves a party-line of 8-12 
speaking-listening stations, so arranged 
that speech originating in any one is 
heard in all the others. Since only one 
person speaks at a time, each trainee 
spends more of his training time in lis- 
tening than in speaking, saying one mes- 
sage and hearing 7-11. This dispropor- 
tionate amount of listening prompted 
investigation of whether the chief bene- 
fit derived from training might be in- 
creased listening ability rather than 
speaking ability. If so, stimuli for listen- 
ing training could be prepared, recorded, 
and presented through loudspeakers or 
headsets, and the number of students per 
instructor greatly increased. In consider- 
ing revisions of training in voice com- 
munication, the available evidence 
which bore upon these questions was 
studied. First, the internal evidence pre- 
sent in tests where all students both 
spoke and listened, before and after 
training, was examined; then experi- 
ments were planned to estimate the de- 
gree to which listening ability responded 
to training. 


Evidence suggesting that speaking 
might be more amenable to training 
than listening ability appeared in pre- 
training score distributions. The total 
score for a group where each member 
acted in rotation as speaker and listener, 
was the same, whether listening or 
speaking performance scores were com- 
puted. The men of a group were more 
alike, however, in listening than in 
speaking, as indicated by the smaller 
scatter of listening scores. This may be 
seen through comparing the values of 
columns 4 and 6 of Table I, which pre- 


sents standard deviations of typical 
score distributions of men as speakers 
and listeners before training. 


When course contents had been developed
which were effective in improving
communication efficiency, it was possible
to carry comparisons of scatter one
step further. Changes in scatter due to
training could be compared for speaking
performance and listening performance.
Columns 5 and 7 of Table I show
scatter of typical post-training performance
in speaking and listening. Results
in this table, typical of the great majority
of groups tested, show a significant
decrease in scatter of speaker scores
(three points), but the variability of
listening performance remains substantially
unaltered after training (8.0 vs.
7.7; 10.9 vs. 9.4). The ratio between
initial and final speaking variances
(F-test) is significant at the 1% level
in each case. The situation of speaking-
listening scores is one in which a group
receives training which is evaluated by
two measures derived from one test,
essentially two tests. In one the distribution
remained the same in scatter, in the
other it decreased. Although this is not
conclusive evidence that speaking training
is more important than listening
training, it shows that only in speaking
abilities did a group become more
nearly alike through training. Since
training typically affects both performance
level and scatter in ability, and
training designed to assure a minimum
level of competence characteristically
decreases scatter, the presumption is
speaking, not listening abilities. Cor- 
relations between listener-speaker scores 
for individuals generally low 
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TABLE I.—Spreap oF SPEAKER AND LISTENER SCORES AS AFFECTED BY TRAINING IN SPEAKING. 


Group Intelligibility 


Initial Final 
Test N Mean Mean 
Write Down 104 57-3 73-9 
24-Word 
Write Down 141 52.6 70.0 


Speaker S.D. Listener S.D. 
Initial Final Initial Final 
13.2 10.1 10.9 9-4 
11.4 8.5 8.0 7-7 


(median r for 3 trained groups is .26). 
Fairly convincing proof that speaking 
performance was the critical variable 
was also present in the general circum- 
stance in which test scores did not show 
significant student improvement with 
long periods of training (7-8 hours) 
until more selected voice-training techniques
were introduced. The subsequent
improvement due to training is illustrated
in the gains that are shown in
Table VII. These resulted from a 4-hour 
training period, including two one-half 
hour periods of intelligibility testing. 
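
The comparison of scatter before and after training in Table I rests on a ratio of variances. A minimal sketch of that ratio in Python follows; the function name is illustrative, the standard deviations are those printed in Table I, and no significance level is computed here because that would require the degrees of freedom and an F table.

```python
def variance_ratio(sd_initial, sd_final):
    """Ratio of the initial variance to the final variance (squares of the S.D.s)."""
    return sd_initial ** 2 / sd_final ** 2


# Standard deviations from Table I: (initial, final) for each test group.
speaker_sds = [(13.2, 10.1), (11.4, 8.5)]
listener_sds = [(10.9, 9.4), (8.0, 7.7)]

speaker_ratios = [round(variance_ratio(a, b), 2) for a, b in speaker_sds]
listener_ratios = [round(variance_ratio(a, b), 2) for a, b in listener_sds]
print(speaker_ratios, listener_ratios)
```

The speaker ratios come out near 1.7 to 1.8, while the listener ratios stay closer to 1, which is the contrast the text describes. Because the same men were measured before and after training, a plain ratio of this kind is only an approximation to the test a modern analyst might choose.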

The amount of listening improvement 
through training was still uncertain. 
Attempts were made, therefore, to im- 
prove listening ability, and to measure 
the effects of aspects of listening exper- 
ience as they affected communication 
efficiency. In a study designed to evaluate 
listening training, 155 AAF cadets (un- 
assigned) took a recorded intelligibility 
test, were trained in listening, and re- 
tested, the same test being used before 
and after training.¹

Different kinds of training were given
to sub-groups to emphasize one aspect of
listening experience:

(a) Control (N = 50). Group received no
training beyond that incidental to being
tested twice on successive days.

(b) General Listening Training (N = 38).
Group received three hours of training in
airplane-type noise and that incidental to
taking the listening tests. A monitor spoke
a word over the interphone, “Number ___
is ___.” Subjects wrote the word as they
heard it. The monitor then held up a card
containing the test word. Listeners thus
received confirmation of correct responses or
corrected wrong ones. The monitor then
repeated the word to give additional cues
to its recognition. Approximately 480 words
were presented in each training hour. Words
were different from those in the listening
test and speakers were not the ones who
recorded the listening test.

(c) Familiar Speakers (N = 38). Training received
by this group was the same as that
of group b, except that the monitors who
gave listening training were 4 of 8 speakers
who recorded the listening test used before
and after training.

(d) Familiar Words (N = 29). Training in this
group was the same as in group b, except
that training exercises contained ½ of the
words of the test used before and after
training. Each test word used in training
was presented 5 times during the 3-hour
period. The order of presentation was randomized
with regard to the order in which
words appeared in the test, and the test
words were interspersed randomly among
others used in training.

¹ The test used to evaluate listening training
was a set of 8 phonograph records made by 8
speakers, each of whom read a test list of 24
words over the hand-held microphone under
standard test conditions. In another circumstance,
153 student pilots made an average score
of 57, S.D. 8.5, on the test.



Table II shows that increased listening 
ability was evident in all groups, beyond 
that from taking the test a second time, 
that is, beyond the control group; the 
group whose training consisted of fami- 
liarization with one-half of the test 
words benefited most from training. 

The results shown in Table II are for
the complete test-retest situation. A more
precise estimate of the effect of familiarization
with speakers’ voices is shown in
the comparison of gain on the part of
the test recorded by training monitors
with the gain on the balance of the test,

TABLE II.—FINAL STATUS OF CONTROL LISTENERS AND LISTENERS TRAINED BY THREE METHODS.

a. Intelligibility Measurements

                                              Adjusted        Adjusted
Group                                   N     Final Mean²     Mean Gain
(a) Control                             50       57.5            6.7
(b) General Training                    38       60.4            8.2
(c) Familiarized with Speakers          38       61.3           10.9
(d) Familiarized with Words             29       83.4           21.7

b. Statistical Analysis

Comparison                                                       Difference
General Training minus Control (b − a)                              1.5        2.82
Familiarization with Speakers minus General Training (c − b)        2.7         .80
Familiarization with Words minus Familiarization
  with Speakers (d − c)                                            10.8       18.34

² This is Final Mean + an amount necessary to adjust for slight differences in initial group
status. Initial means varied from 50.9 to 55.1 points, the general average being 51.4. r, initial
vs. final scores, within groups = .83. S.D. (within groups) initial scores, 10.0; final scores, 10.1.
All final scores were significantly higher than initial scores at the 1% level of confidence.


TABLE III.—ADJUSTED MEAN GAINS ON FAMILIARIZED vs. NON-FAMILIARIZED MATERIALS WITHIN
GROUPS TAUGHT BY METHODS GIVING FAMILIARITY WITH ONE ASPECT OF TEST MATERIAL.

                                                  Mean Gain,        Mean Gain,
                                                  Familiar          Unfamiliar
Familiarity Condition                             Part of Test      Part of Test
Familiarized with Speakers’ Voices (Group c)          9.9              11.8
Familiarized with Test Words (Group d)               31.6              11.3


heard only at initial and final testings. A 
similar comparison is made for familiar- 
ization with words. These comparisons 
appear in Table III. 

The final status was approximately 
the same with regard to the words which 
had not been heard in training, regard- 
less of the voices which spoke them 
(Table III). Familiarization with words 
was the factor responsible for the in- 
creased gain in the group which heard 
test words during practice, since this 
group of words accounted for virtually 
all the difference between the final status 
of group c and group d (Table II). 

Since familiarization with words 
showed the outstanding result, more 
details are related concerning familiar- 
ization practice. Each panel heard 480 
words each hour, a total of 1440 word 
presentations. The words with which 
they were familiarized were 96 of the 
192 words used in the listening test. 
Each was presented 5 times during the 


training series, this familiarization ac- 
counting for one-third of the instruc- 
tional time. The fact that familiarization 
words were randomized with regard to
the other practice words and the order
in which they appeared in the test lists
precluded the possibility that the proper 
word might have been recognized, not 
by itself, but by the word which it 
followed. 
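
The proportions quoted for the familiarization practice follow from simple arithmetic, which can be laid out as below. This is only a check on the stated figures; the variable names are illustrative.

```python
from fractions import Fraction

WORDS_PER_HOUR = 480       # word presentations in each training hour
TRAINING_HOURS = 3
TEST_WORDS = 192           # words in the listening test
FAMILIARIZED_WORDS = 96    # test words also used in training
REPETITIONS = 5            # presentations of each familiarized word

total_presentations = WORDS_PER_HOUR * TRAINING_HOURS
familiarization_presentations = FAMILIARIZED_WORDS * REPETITIONS

share_of_training = Fraction(familiarization_presentations, total_presentations)
share_of_test_words = Fraction(FAMILIARIZED_WORDS, TEST_WORDS)
print(total_presentations, familiarization_presentations)
print(share_of_training, share_of_test_words)
```

The 96 words presented 5 times each give 480 presentations out of 1440, or one-third of the practice, and 96 of 192 is one-half of the test vocabulary.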

In summary, this experiment demonstrated
several degrees of improvement
due to listening experience: (a) average
improvement due to test-retest, 6.7
points; (b) average improvement due
to 3 hours’ listening training, 8.2 points;
(c) average improvement due to 3 hours’
training involving familiarizing listeners
with one-half of the test speakers’ voices, 10.9
points; (d) average improvement due
to 3 hours’ training, one-third of it involving
familiarization with one-half the test words,
21.7 points.

All improvements, including that due
to test-retest, were significantly greater
than zero.
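
The step-by-step differences between these four gains, which are the quantities tested in part b of Table II, can be reproduced with a short sketch. The function name is illustrative; the gains are the adjusted mean gains quoted in the summary above.

```python
def successive_differences(gains):
    """Difference between each gain and the one before it, to one decimal."""
    return [round(later - earlier, 1) for earlier, later in zip(gains, gains[1:])]


# Adjusted mean gains in the order (a) control, (b) general training,
# (c) familiarized with speakers, (d) familiarized with words.
adjusted_gains = [6.7, 8.2, 10.9, 21.7]
differences = successive_differences(adjusted_gains)
print(differences)
```

The three results correspond to the comparisons (b − a), (c − b), and (d − c) and match the Difference column of Table II, with by far the largest step coming from familiarization with the test words.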

The experiment described above uti- 
lized recordings made with the hand- 
held microphone. Since trainees exper- 
ienced greater difficulty in understand- 
ing speech through the throat micro- 
phone, the possibility that listening 
experience with this microphone would 
improve communication efficiency was 
explored. Two groups of trainees were 
given training by listening to recordings 
of word lists. One group (N = 31) 
listened to recordings made with the 
speaker using the throat microphone; 
the other (N = 23) to recordings made 
with the hand-held microphone. 

Recordings for the two groups were 
of the same word lists, spoken by the 
same six trained speakers. Signal strength 
was monitored for each word by means 
of a vacuum-tube voltmeter across the 
cutter terminals. 

The records were made in quiet. 
When played back for listening practice, 
they were “mixed” with the U. S. Navy 
Training Record Precipitation Static 
(headset voltage for speech, 4.2; for 
static, 1.0). Room noise was maintained 
(checked before each session) at 108-110 
db. The listener subjects wrote down 
the words as they understood them dur- 
ing one 50-minute period of training 
per day for five days. Records used in the 
second and fifth periods were equated 
word lists, and responses on these lists 
were used as comparison measures for 
determining improvement in listening 
ability. One practice period was allowed 
so that measured improvement would 
not be due to accustomization to the 
situation. 


Performance was better for the hand- 
held microphones; the gain in listening 
score was significant at the 1% level in 
each case. There was no significant dif- 
ference in the gain due to listening to 
records made with one microphone over 
that made on records made with the 
other, the small observed difference 
favoring the hand-held microphone. 
This does not support the idea that 
accustomization is quantitatively greater 
for the throat microphone; however, the 
increase might mean much in ease of 
communication. It should be noted also 
that some variation in intelligibility of 
the test records may have been present. 
Although word lists were equivalent 
within a small margin and recording 
levels controlled, the completed records 
were not independently checked for 
equivalence. 


In summary, improvement in intellig- 
ibility scores after training may arise 
from several causes, familiarization with 
equipment, situation, or test words, or 
better speaking or listening. Chance 
variations in equivalence of word lists 
used in comparison tests are also factors. 
Insofar as possible, in experiments here 
reported, these factors were controlled; 
the level of noise in the test-room was 
checked periodically and the noise spec- 
trum was monitored by checking separ- 
ately the voltage-output of high- and 
low-frequency components of the output 
to the loudspeakers. 


Since intelligibility gains occurring
between two one-half hour tests are
obviously “familiarization” effects, such
score changes should, if possible, be
eliminated from comparisons (Table V).


TABLE IV.—LISTENING ABILITY AS AFFECTED BY LISTENING PRACTICE.

                        Mean Listening Scores
Microphone      N      Period 2     Period 5     Difference      t
Hand-held       23       72.1         83.3          11.2       11.51
Throat          31       44.1         52.5           8.4        5.86


TABLE V.—INCREASES IN INTELLIGIBILITY THROUGH TEST-RETEST WITHOUT INTERVENING TRAINING.

Type of Study           N      Microphone     Mean Gain
Training study          41     hand-held
Training study          38     hand-held         5.8³
Altitude study          15     throat            4.0
Articulation study      29     hand-held         3.1
Average                                          4.25

³ Gain was statistically significant above the 5% level of confidence.


This is relatively easily done where
controls were given the same testing as 
experimental subjects, as in the first 
listening experiment reported. By sub- 
traction, a “net” training gain can be 
computed. Similarly, several training 


opportunity to “learn” the test words 
produced greater gains in intelligibility 
than any other procedure. (Item 3, 
“familiarization ‘net’ gain”, 20.7 vs. con- 
trol, item 7, 6.7 points). In other in- 
stances listening training was inferior to 


TABLE VI.—‘Net” Gain IN INTELLIGIBILITY ScorE DUE TO A 3-Hour TRAINING PERiop (Averages). 


“Net” Gains 


Training Condition (Averages) 
(1) General listening training (N = 38) 1.5 
(2) Familiarization with test voices (N = 38) 14 
(3) Familiarization with words used in test (N = 29) 20.7 
(4) Listening to Trained speakers (hand-held microphone, N = 23) 7-34 
(5) Listening to Trained speakers (throat microphone, N = 31) 1.74 
(6) Speaking training (5 groups, N’s 78-543) average 13.05 
(7) Control groups for items 1-3, tested with same words twice (N = 50) 7 


(8) Control group (4) for items 4-6 tested with equivalent word lists twice (N’s 15-41) 42 


4Net computed by subtracting average “familiarization” value for all speaking-training 
control groups. The situations were alike in that the students had no opportunity to “learn” 


the test words. 


5 Derived from the following data by subtracting mean “familiarization” scores of speaking- 
training control groups. The entire table is included because it is relevant to the topic of 


this summary. 


TABLE VIIL—INTELLIGIBILITY ScoRES OF SPEAKERS BEFORE AND AFTER TRAINING IN AAF VoIce 
COMMUNICATION Courses. 


Type of 

Training Untrained Trained Mean 
Center N Mean S.D Mean S.D. Gain 
Pilot 329 49-3 13.6 67.1 11.3 17.8 
Pilot 128 59-6 13.5 76.7 9-5 17-1 
Gunner 543 67.7 81.7 14.0 
Navigator 229 65.9 13.1 85.4 6.9 19.5 
Bombardier 73 68.6 13-4 84.7 7-9 16.1 


experiments involved controls. By aver- 
aging the “familiarization” gains made 
in these, “net” gain in speaking-training 
is approximated. Table VI lists for com- 
parison these “net” measures in various 
types of listening and speaking training. 

It may be seen in Table VI that an 


speaking training, except where listening 
records employed the hand-held micro- 
phone and trained speakers. This is the 
only instance where a notable net gain 
(7-8 points) in performance on un- 
familiar words resulted from listening 
training. Experimental error is possible 
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in this case due to use of less well- 
standardized tests. Speaking training 
classes averaged 13 points gain for three 
hours instruction. These gains were 
achieved by AAF ground school instruc- 
tors, including in no instance a_ profes- 
sional speech teacher. The experimental 
groups, on the other hand, including the 
listening training classes, were taught 
by experienced teachers who used all the 
devices at their command to maintain 
interest and motivation. Clearly, im- 
proved speaking, not listening, was the 


principal factor accounting for increased 
efficiency through the training in AAF 
communication courses. 
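
The subtraction behind a “net” gain can be sketched in a few lines. This is only an illustration: the raw gains are the mean gains of the five speaking-training groups in Table VII, while the familiarization value of 4.2 points and the unweighted averaging are assumptions, since the text does not state how the groups were weighted. The result is therefore not expected to reproduce item 6 of Table VI exactly.

```python
from statistics import mean

# Mean gains of the five speaking-training groups (Table VII).
raw_gains = [17.8, 17.1, 14.0, 19.5, 16.1]

# Assumed test-retest ("familiarization") gain without training, in points.
FAMILIARIZATION_GAIN = 4.2


def net_gain(raw, familiarization=FAMILIARIZATION_GAIN):
    """Return the gain that remains after removing the test-retest gain."""
    return round(raw - familiarization, 1)


net_gains = [net_gain(g) for g in raw_gains]

# Unweighted average over groups (an assumption of this sketch).
average_net_gain = round(mean(net_gains), 1)

print(net_gains, average_net_gain)
```

Weighting each group by its N instead of averaging the groups equally would give a somewhat different figure.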

Listening training alone was relatively 
ineffective. The experiments did show, 
however, that familiarity with a special 
vocabulary may be an important factor 
in communication efhciency. The pos- 
sibility of effective techniques for listen- 
ing training is not precluded. To date, 
however, they have not been developed 
and proved to the extent that speaking 
training has. 


INTELLIGIBILITY RELATED TO ROUTINIZED MESSAGES


HENRY M. MOSER 
Wapakoneta, Ohio 


OTHER papers in this series treat of
modes of talking that improve com- 
prehension, placing the burden of effect- 
ive transmission upon the speaker and 
assuming passivity on the listener's part. 
In the case for routinized messages the 
listener participates in the improvement 
in that it is his familiarity with the mes- 
sages that accounts for the superior 
efficiency of routinized messages over 
impromptu ones. 


One experiment (reported in a previous
paper) showed that listener familiarity
with test words contributed to
considerable gain in intelligibility scores.
This increase occurred even when famil- 
iar items appeared in random order 
without associations and mixed with 
unfamiliar items. The gain in the famil- 
iar items was 31.3, while the gain of 
another group, similarly trained, tested 
on unfamiliar items was 8.2. The infer- 
ence for voice communication under 
difficult circumstances is that the speaker 
should express himself in phraseologies 
that have been standardized for the 
specific conditions. Countless circumstances
in which experienced airmen and
control tower operators managed to
communicate under very difficult circumstances
can be explained largely in
terms of their using the familiar message.


In the early 1940's considerable prog- 
ress was made in standardizing radio- 
telephone messages for airplanes, both 
in the services and in the Civil Aeronau- 
tics Administration. This development 
stopped short of adequate instructional 
methods. Moreover, the established pro- 
cedures were necessarily altered and 
amended frequently in view of more 
and more opportunities and reasons for 
radiotelephone communication. The
principal tasks, then, with regard to
incorporating in a training program the 
advantages to intelligibility of the rou- 
tinized radiotelephone message were to 
correlate authorized usages, build class- 
room drills around them, and promote 
and emphasize among the airmen in 
training the importance of using them 
to establish habits of saying and recog- 
nizing correct phraseologies. 

Voice messages over the interphone system of the multi-place airplane, however, were almost completely unsystematized in 1943-4.¹ The invitation to recommend interphone voice procedures to AAF presented the researchers with a task to solve a vital problem unaffected by tradition or precept.


¹ That this was so is interesting and understandable only insofar as the facts might bear out a plausible explanation that borders on fantasy. As soon as the aircrew leaves the ground it is a detached unit of men, for the most part out of communication with like units or any other population. Routinely with flights in this country the pilot checks with fixed radio stations along his course, and, in line with similar conventions among other specialists, he greets his fellow fliers along the way with a slight maneuver of the plane or a blinking of lights. Otherwise, and particularly in talking over the interphone, the aircrew carries on its own business, saying to each other what seems appropriate to the task of the moment. These messages might have remained completely unorganized chatter except for a few outside influences. The crew members frequently listened in when the pilot and the ground stations communicated with each other and thus the eavesdroppers picked up notions about speaking within the bounds of procedure. Consequently, the crewmen developed a semblance of a convention that became known among them as interphone procedure. Conversations with the men elicit, and with naive certainty, a positive assertion that “Interphone procedure is thus and so.” Interestingly, these “thus and so’s” were almost as varied as the number of people questioned—a different explanation from each crew. Other influences that come to bear on the crew member include comics, motion pictures, and the radio—all including from time to time examples of crewmen talking to each other. More than one gunner cited a popular comic strip as his source for interphone usages. Thus the variety of official sounding messages used, or given popular presentation, or even appearing in authorized training literature, can be accounted for on the basis of the assumptions and generalizations of experienced technical advisers who had flown in different crews and who worked independently.


The first step in the study of interphone communication was to find the topics, wordings, and relative efficiency of current interphone practices. To do this, magnetic wire recorders were connected into the interphone lines of different types of bombers, and the messages that were transmitted on 40 training flights were recorded. The crews were not told about the recorders, and in nearly all cases did not notice them. Each recorder operated only when a microphone button was pressed, that is, when someone was talking. This resulted in a continuous recording. On a typical 4-6 hour training flight the interphone was used for a total of approximately 90 minutes. Nearly 2500 transmissions were recorded. Analysis showed these to relate to over 200 topics of business, some of which were specific to each type of aircraft. This largely defined the scope of the subject matter that had to be accommodated by a set of procedures.

The same analysis clearly demonstrated the need for improvement in interphone usage on the basis of intelligibility. Of the 1500 messages that elicited a verbal response from a listener, 40 per cent of the responses asked, in one wording or another, that the original transmission be repeated.

The second step in the process of establishing a procedure was to treat with the order and words of the routines that necessarily surrounded the message. All crew members normally wear headsets, but they may be connected to a radio receiver rather than the interphone party-line. In addition, several items of business must often occur in a short space of time. Further, it is desirable that the point of origin be known for many transactions. These conditions indicate the necessity for some additional phrases not included in the message itself. The following summary sequence accommodates these requirements.


X alerts Y for a message;
Y signals readiness for message;
X sends and ends message;
Y acknowledges message.


For convenience these parts are named, 
for present purposes, call-up, reply, mes- 
sage, and response. 
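
The four-part sequence above can be pictured as an ordered list of transmissions. The sketch below is illustrative only: the call-up wording “bombardier to pilot” comes from the survey described in the text, but the reply and response wordings (“go ahead”, “roger”) and all names in the code are hypothetical placeholders, not phrases recommended by the study.

```python
from dataclasses import dataclass

# The four named parts, in the order given in the text.
PARTS = ("call-up", "reply", "message", "response")


@dataclass
class Transmission:
    part: str      # one of PARTS
    sender: str    # station that speaks
    receiver: str  # station addressed
    text: str      # words spoken


def exchange(x, y, message_text):
    """Build one complete exchange in which station x sends a message to y."""
    return [
        Transmission("call-up", x, y, f"{x} to {y}"),   # X alerts Y
        Transmission("reply", y, x, "go ahead"),        # hypothetical wording
        Transmission("message", x, y, message_text),    # X sends and ends
        Transmission("response", y, x, "roger"),        # hypothetical wording
    ]


steps = exchange("bombardier", "pilot", "request heading")
for step in steps:
    print(f"{step.part:9} {step.sender} -> {step.receiver}: {step.text}")
```

The sender alternates between the two stations, which is how the sequence keeps the point of origin known at every step.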

Experiments were conducted in which 
alternative wordings and orders of mes- 
sages were tried. Groups of subjects sat 
in a noise-filled room and spoke and 
listened over airplane interphone equip- 
ment. The techniques of the experiments 
are illustrated in the approach to the 
call-up. The survey had shown that in
practice two-thirds of the transmissions
were “station name to station name”,
for example, “bombardier to pilot”, in
line with an outmoded radiotelephone
usage. To determine whether it was the
most effective wording, 120 students,
divided into 12 simulated air crews,
spoke messages and listened in rotation. 
The messages, 1440 altogether, followed 
in each instance one of four call-ups, 
presented in a counterbalanced manner. 
Two of these gave the name of the 
speaker first, “bombardier to pilot”, and 
“bombardier calling pilot”. The other 
two reversed the order and used the 
connecting words this is and from, “pilot 
this is bombardier,” and “pilot from 
bombardier”. The listeners wrote down 
the message. This technique did not
show statistically significant differences
in the effectiveness of the four types, but
in this and a series of similar experiments
a consistent advantage resulted
for the to- and calling-types of call-up.
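
One way to picture a counterbalanced presentation of the four call-ups is a simple rotation. The text gives only the totals (120 students, 1440 messages, four call-up types); the assumption that every student spoke 12 messages, and the rotation rule itself, are hypothetical choices made for this sketch.

```python
from collections import Counter

# The four call-up wordings named in the text ({x} = speaker, {y} = listener).
CALL_UPS = [
    "{x} to {y}",
    "{x} calling {y}",
    "{y} this is {x}",
    "{y} from {x}",
]


def call_up_schedule(n_speakers, messages_per_speaker):
    """Assign a call-up type to every message by rotating the starting type.

    Returns a list of (speaker, slot, call_up_index) tuples. Shifting the
    starting type by one for each speaker balances both the overall counts
    and the position in which each type occurs.
    """
    schedule = []
    for speaker in range(n_speakers):
        for slot in range(messages_per_speaker):
            call_up = (speaker + slot) % len(CALL_UPS)
            schedule.append((speaker, slot, call_up))
    return schedule


schedule = call_up_schedule(n_speakers=120, messages_per_speaker=12)
totals = Counter(call_up for _, _, call_up in schedule)
print(len(schedule), dict(totals))
print(CALL_UPS[0].format(x="bombardier", y="pilot"))
```

With these assumptions each call-up type is used equally often overall and equally often by every speaker, so differences in scores cannot be traced to one wording having been given to better speakers or easier messages.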


Obviously, understandability can be improved by the routine use of single-purpose words, particularly ones that are inherently highly intelligible. Station names would be the most frequently used words and should be well-selected words. The survey showed that many times the wrong party responded to an interphone message, and, further, that there was lack of uniformity in the names given to interphone positions by different crews. Therefore, different name-words, for the most part ones already in use, were tried out for each position in the manner just described. For example, the names that were tried for one station were ball turret, ball turret gunner, belly gun and ball gun.

The third step in the formulation of 
the procedure was to select words for the 
message. The aims were to reduce trans- 
mission time, to avoid the necessity for 
impromptu phrasing, and to increase 
intelligibility. The method was to study 
the job-analysis recordings for potential 
topics and to accommodate them as far 
as possible with single-purpose and 
highly-intelligible words. For example, 
a multiplicity of terms was in common 
use, the sense of which was to ask for 
repetition of a message. Half of them 
were on the order of what, what did you 
say, and huh. Others conformed to cur- 
rent or outmoded radiotelephone 
phrases, and the remainder were bizarre, 
such as “Put your teeth in so I can hear
you.” As many as 20 other situations
lent themselves to single-purpose words. 
Where possible, the words were taken 
over from radiotelephone language in 
order that crew members who used that 
language would not have to speak two 
sets of symbols on different occasions 
over the same equipment. Other words 
were necessarily coined to fit common 
elements in messages. For example, 
request was applicable to a large propor- 
tion of messages and its consistent use 
would reduce the potential wordiness of 
transmissions considerably. 


Finally, devices were borrowed to improve intelligibility through the manner in which words were presented. First, numerals were used alone or in combination to express time, heading, fuel supply, radio frequency, altitude, type of aircraft, amount of film, speed, distance, etc. For the most part their pronunciation and the ways of coupling them were already prescribed for radiotelephone. It remained only to make a few adaptations and extensions to use the system very beneficially in interphone. In the same way an existing procedure for identifying letters of the alphabet was directly applicable. It provided assurance that the minute distinctions between words could be transmitted clearly.


In large measure this paper indicates 
how the principles of routinized pro- 
cedures were effected in one instance. 
First, topics of communication were
ascertained and then a code of rules 
was drawn up to accommodate the topics. 
It goes without saying that the code, 
to be effective, had to be given to stu- 
dents in a manner and to a degree that 
made its use habitual. 


Over-shadowing the illustration is the 
demonstration that routinized messages 
are necessary when communication in 
noise is vital; also, that the standardized 
message serves to expedite communica- 
tion and to reduce verbose wordings. 


In viewing routinized messages as an 
approach to intelligibility in telephony,
it is emphasized that both the listener 
and the speaker must become familiar 
with rules and practiced in using them. 
Otherwise the listener hears more than 
average only to the extent that the mes- 
sage contains words that are inherently 
more intelligible than impromptu 
phrasings. 


INTELLIGIBILITY RELATED TO ARTICULATION


GAYLAND L. DRAEGERT 
Purdue University 


IN aircraft communication in which either the speaker, the listener, or both, are surrounded by high-level noise, the sense of a message may depend upon the transmission of one consonant sound. For example, in a traffic-jam the difference between “You are not cleared to land” and “You are now cleared to land” is small in an articulation sense and large in a view of operations and meaning. Instances of time-consuming and nerve-straining misunderstandings are manifold in recordings of intercommunication in planes during air-crew training. Consequently the problems posed for study were, first, whether the difference between normal and superior articulatory practice was important under the circumstances of airplane communication, and, second, if so, whether articulation patterns could be changed in a short training period.

Four experiments, three of which were 
preliminary, were designed to investigate 
the effects of training in pronunciation 
or articulation. Results indicated that 
training for one hour in clear pronuncia- 
tion affected a subsequent intelligibility 
test, not only in the scores obtained by 
the speakers, but in the degrees to which 
judges heard words as being articulated. 

In the first experiment, three groups of approximately 30 subjects each were trained for 20-30 minutes as an expediency measure to stress only final consonants. Results showed a significant gain in intelligibility when the hand-held microphone was used in the pre- and post-training tests. Other groups, using throat and mask microphones, showed no gain in intelligibility (see Table I).

Another rule of thumb, namely that intelligibility depends upon stressing sibilant sounds, was widely accepted and lent itself to experimental study. Two groups of 30 students each read lists of 24 common words, normally, using the hand-held microphone. One group repeated the reading without further instruction. The other practiced for 20 minutes under supervision the stressing of sibilants, and then read test lists a second time. Monitors were satisfied that the subjects were stressing the sibilant sounds in the tests, and also observed that the result was in many instances distorted word patterns. The experiment produced no changes of significance and cast doubt on the importance of the topic as related to intelligibility (Table II).

Another group of subjects (28) was given training in clear pronunciation in two 30-minute periods. The training was in a quiet room and over the interphone, using the hand-held microphone. In contrast with a control group, the trained subjects showed significant gains in intelligibility (Table III). The results of this preliminary experiment, although positive, were not conclusive, for the possibilities remained that articulation in itself was not affected and that the training produced important changes in loudness level.


TABLE I.—INTELLIGIBILITY OF STUDENTS READING ONE-SYLLABLE WORDS NORMALLY AND WITH SPECIAL STRESS ON FINAL CONSONANTS.

                                                              S.D.
                           Normal    Stressed                 Within
Microphone            N    Mean      Mean       Difference    Groups    t
Hand-held (T-17)      30   50.5      56.5        6.0          10.2      4.26
Throat                32   39.2      40.9        0.7           9.9      0.50
Insert (A-10 Mask)    27   56.2      52.9       -3.3          11.2      1.61

“t’s” from distributions of differences, 1% = 2.75


TABLE II.—INITIAL AND FINAL INTELLIGIBILITY SCORES OF CONTROL SUBJECTS AND SUBJECTS TRAINED IN STRESSING SIBILANT SOUNDS.

                     Initial    Final
Group           N    Mean       Mean
Experimental    29   54.2       53.9
Control         29   51.7       54.8

Weighted Ave. S.D. within groups = 10.9


TABLE III.—INITIAL AND FINAL SCORES OF CONTROL SUBJECTS AND SUBJECTS TRAINED IN CLEAR PRONUNCIATION (in quiet).

                     Initial    Final
Group           N    Mean       Mean      t
Experimental    28   56.0       60.14     6.90
Control         30   57.0       57.3

“t” from distribution of differences, 1% = 2.77
Weighted Ave. S.D. within groups = 10.3

Following the indications of the pre- 
liminary experiments a larger study was 
planned to find (1) the effect of articu- 
lation training on intelligibility test 
scores, (2) the effects on loudness level, 
and (3) the effects on degree of articula- 
tion as observed by judges. 

One experimental group of 37 student 
pilots was tested, trained for one class 
hour in clear pronunciation of words, 
and retested. Thirty-eight other students 
were used as a control group. Two types 
of measure were used: a standardized
word intelligibility test (multiple-choice
forms) and a special list of twelve
“difficult” one-syllable words. The latter 
was a write-down exercise in which all 
words contained one or more of the 
sibilant sounds and were of low intellig- 
ibility value. The same form of this 
special exercise was used both initially 
and finally, specifically to provide pre- 
and post-training recordings of the same 
words. Phonograph recordings were 
made of initial and final performances 
for the speakers on one circuit. 

Training given in articulation covered one 55-minute class period on the day between initial and final tests. The training hour included:

1. An instruction period on the need for clear articulation when speaking in noise (5 minutes).

2. Examples of words containing sounds of low intelligibility value (2 minutes).

3. Drill in noise with lists of low intelligibility words (37 minutes).

4. Demonstration recording, “Articulation,” played through subjects’ headsets (6 minutes).

5. Drill with interphone messages in noise. The monitor criticized unclear speech (5 minutes).


Throughout the teaching session, all 
criticisms were directed toward clearness 
of speech and articulation of sounds. No 
mention was made of rate of speaking, 
loudness, or rhythm. The method for 
drill with difficult words was for one 
man to read a word and then designate 
a listener to read back the same word. 
If the word was incorrectly received, the 
speaker pronounced it again. If the 
monitor perceived inadequate articula- 
tion, he volunteered a corrected pattern 
for the faulty words. Each speaker read 
12 different words. 

The trained group improved 7.4 in average score on the multiple-choice test and 8.1 in the percentage of difficult words correctly heard (Table IV). The control group showed no gain on the multiple-choice test, but gained 6.1 points with the difficult words. In assessing the gains in understanding the difficult words, weight must be given to the familiarity factor, since identical lists were used initially and finally. The gain of the experimental group on the standardized test indicates improved intelligibility on words in general.
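
The tables in this paper report “t’s” from distributions of differences. A minimal sketch of that statistic follows, assuming the usual paired form (mean difference divided by its standard error). The function name and the scores are hypothetical and are not data from the study.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(initial, final):
    """t for paired scores: mean difference over its standard error."""
    diffs = [after - before for before, after in zip(initial, final)]
    standard_error = stdev(diffs) / sqrt(len(diffs))
    return mean(diffs) / standard_error


# Hypothetical initial and final scores for four speakers.
initial_scores = [50, 55, 60, 45]
final_scores = [56, 58, 67, 49]

t_value = paired_t(initial_scores, final_scores)
print(round(t_value, 2))
```

The resulting value would be compared with a tabled criterion for the number of subjects involved, such as the 1% values quoted beneath the tables.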


TABLE IV.—INTELLIGIBILITY OF TRAINED AND CONTROL SPEAKERS (Training Conducted in Noise).

Test and                   Initial           Final
Group                 N    Mean     S.D.     Mean     S.D.     Difference    t
Multiple-Choice Test
  Trained             37   58.8      8.6     66.2      7.8     7.4           3.77
  Control             38   63.4      8.7     64.1      7.4
Difficult Words
  Trained             37   38.2     11.3     46.3     10.2     8.1           4.39
  Control             38   38.8     12.8     45.0      9.7     6.1           3.30

“t’s” from distributions of differences, 1% = 2.75


The deflections of a VU meter connected across the headset circuit were recorded for all of the test items. Mean values showed no significant differences in VU readings between either the pre- and post-training loudness levels or between the trained and control subjects.

From the recordings of 12 subjects of the experimental group, re-recordings of the lists of difficult words were made in such a manner that initial and final speaking of each word by the same speaker were paired. These pairs of recordings of the same word, spoken before and after training, were played back to a panel of nine judges, four of whom were specialists in speech. The order of presentation was randomized. Volume of re-recording of pre-training and post-training samples was equalized by dry-run monitoring with a cathode-ray oscilloscope, this before the pair of stimuli was recorded. All the judges were able to detect the effect of training upon articulation to a significant degree (median of proportion of correct judgments, 72%). Although the 12 speakers taken as a group were judged to be changed by articulation training, some speakers did not change to a discernible degree (Table V).


TABLE V.—PROPORTION OF WORDS SPOKEN WITH MORE ARTICULATION AFTER TRAINING.¹

N = 12.

Per cent of words more articulated after training, by speaker:

94  88  88  85  83  81  79  56  49  46  46  39

¹ Degree of articulation is determined by composite score of a panel of 9 judges listening to 7-8 pairs of words per speaker.


Three of the four experiments reported in the study showed significant effects of training in precise articulation upon intelligibility. All involved the hand-held microphone.

Gains in intelligibility were not due to increased loudness of speech signal which might be expected to accompany added care in articulation.

From phonograph records of speech samples before and after training, judges were able to select the speech of trained speakers, using as a criterion the degree to which they articulated test words.

Results of the experiments indicate that training for good articulatory patterns is likely to improve intelligibility and is superior to training which attempts to improve the pronunciation of “difficult” sounds only.

In two cases where general patterns of clear speech were taught, the one involving instruction in noise was the more effective. Too many conditions were different in the experiments to attribute the greater gain to the presence of noise alone, but the presumption is that noise used during training was helpful.


UNDERSTANDABILITY OF SPEECH IN NOISE AS AFFECTED BY
REGION OF ORIGIN OF SPEAKER AND LISTENER


HARRY M. MASON 
Purdue University 


THE 8-12 men who in intelligibility
testing spoke and listened to each
other over the same party line may have
come from widely separated regions of
the country. Their scores offered an opportunity
to determine whether men
who came from the same region understood
each other better than did men
whose home regions were farther apart,
and to compare the over-all speaking and 
listening efficiency of men from various 
parts of the country. Although this was 
an interesting and potentially important 
problem in terms of selection of person- 
nel, facilities could not be diverted from 
more urgent tasks of training to investi- 
gate it thoroughly. A gross approach 
was possible, however, by analyzing the 
intelligibility test answer sheets. 

The Service Command in which each 
trainee was inducted into the army could 
be identified by Army Serial Numbers. 
This designation was used as the man’s 
region or home location. No estimate 
of the degree to which men might have 
been inducted away from the area where 
they formed their speech habits was at 
hand, but this source of error was not 
considered great. A more serious limitation
was that the nine Service Commands
represented large portions of the
country and included in the same Com- 
mand regions with different speech pat- 
terns. For example, the Eighth Service 
Command contained localities as far 
separated as Brownsville, Texas, and the 
Ozark region or Gallup, New Mexico, 
and New Orleans.* 

* Service Command    States
  1    Maine, Vermont, New Hampshire, Massachusetts, Connecticut, Rhode Island.
  2    New York, New Jersey, Delaware.
  3    Pennsylvania, Maryland, Virginia.
  4    North Carolina, South Carolina, Georgia, Florida, Alabama, Mississippi, Tennessee.
  5    Kentucky, West Virginia, Ohio, Indiana.
  6    Michigan, Wisconsin, Illinois.
  7    North Dakota, South Dakota, Wyoming, Nebraska, Colorado, Kansas, Missouri, Iowa, Minnesota.
  8    Texas, Louisiana, Arkansas, Oklahoma, New Mexico.
  9    Montana, Idaho, Utah, Arizona, Nevada, California, Oregon, Washington.


The intelligibility tests for 1913 men were analyzed. For present purposes each speaker was assigned an average intelligibility score on the basis of all the men from a given Service Command who heard him; thus each man had as many separate intelligibility scores as there were Commands represented among his listeners as well as the score assigned him by the whole panel. The scores of the speakers from each Command were sub-divided according to Service Command of listener. Means were then inspected to determine how well speakers from each region were heard by listeners from all Commands. Since neither time nor facilities permitted tests of significance for all the possible differences, a preliminary inspection was made to disclose differences which would probably be both statistically significant and meaningful. This inspection presumed that a listener should be more effective in understanding the men of his own Service Command than in understanding those from other regions. On the other hand, if listening ability differed among the Commands, superior listening could be identified as hearing the speakers of a Command better than did the speaker’s co-residents. The variances and N’s involved in sub-groups indicated that differences above three points were probably significant statistically. Accordingly, regions likely to exhibit significant differences in understandability were selected as those where listeners of one Command heard the speakers of another less well by at least three points than did the speaker’s co-residents. (Differences favoring the away-from-home listeners by the same amount were also investigated for significance.)
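
The screening step described above can be sketched as follows. The record layout, the function names, and the scores are hypothetical; the three-point criterion is the one given in the text. Only speakers heard by listeners from both Commands contribute to a comparison, since a difference cannot otherwise be formed for the same speaker.

```python
from collections import defaultdict
from statistics import mean

THRESHOLD = 3.0  # points; the size of difference singled out for testing


def speaker_means(records):
    """Average each speaker's scores separately for every listener Command.

    records: iterable of (speaker_id, listener_command, score) tuples.
    Returns {speaker_id: {listener_command: mean score}}.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for speaker, listener_cmd, score in records:
        grouped[speaker][listener_cmd].append(score)
    return {s: {c: mean(v) for c, v in by_cmd.items()}
            for s, by_cmd in grouped.items()}


def home_vs_away(means, home_cmd, away_cmd):
    """Mean (home - away) difference over speakers heard by both Commands.

    All speakers in `means` are assumed to come from `home_cmd`.
    """
    diffs = [m[home_cmd] - m[away_cmd]
             for m in means.values()
             if home_cmd in m and away_cmd in m]
    return mean(diffs) if diffs else None


# Hypothetical scores for three speakers from Command 4.
records = [
    ("s1", 4, 60), ("s1", 4, 62), ("s1", 1, 55),
    ("s2", 4, 58), ("s2", 1, 54), ("s2", 1, 56),
    ("s3", 4, 50),  # never heard by Command 1, so excluded
]

means = speaker_means(records)
difference = home_vs_away(means, home_cmd=4, away_cmd=1)
worth_testing = difference is not None and abs(difference) >= THRESHOLD
print(difference, worth_testing)
```

Taking the absolute value covers both directions of difference, in line with the parenthetical remark that differences favoring the away-from-home listeners were also examined.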


In making these tests of differential intelligibility, only speakers who were heard by listeners from both regions in question were considered. This restriction was imposed to make sure that comparable test situations were present. The situations were comparable because they were identical for listeners from both commands (also test lists, testing rooms, and noise conditions).¹

Differences which indicate that speakers heard by both their own co-residents and residents of another command were reliably better understood by their own co-residents are shown in Table I. Because of the critical test of significance used, and because the preliminary winnowing of differences may have missed some significant comparisons, the values of this table give an under-estimate, rather than an over-estimate, of the degree of differential understandability, based on these crude regional identifications.

It may be observed in Table I that all significant differences in intelligibility involved a Northern and a Southern Command, except where the Sixth Service Command was concerned. Those involving this Command may have occurred due to its generally superior listening performance.


¹ In the original test distributions, from which the means in Table II were computed, this was not true, since the listeners from Command A might have heard one group of speakers from Service Command B, while the speakers from B heard by listeners from Command C might have been either entirely or partly different individuals, tested at different times. The possibility that refinements in testing procedures, coupled with changes in regional distribution of trainees, would give spurious differences was thus eliminated from these comparisons.


TABLE I.—SIGNIFICANT DIFFERENCES BETWEEN UNDERSTANDABILITY OF SPEAKERS TO THEIR SERVICE COMMAND RESIDENTS AND RESIDENTS OF OTHER COMMANDS.²

                     Service Command
Speaker's            which heard less      Difference in
Service              well than             Mean
Command              speaker's             Intelligibility    N      t³
                                           3.7                38     5.0
                                           8.0                20     2.7
                                           7.7                22     2.5
                                           4.0                120    2.1
                                           3.7                102    2.4
                                           3.2                209    3.0

² Two significant differences occurred where another Command's listeners heard the speaker better than his own co-residents. Both involved the Sixth Service Command as superior listeners. Speakers were from the First and Seventh. The possibility of listening superiority of this Command being a causal factor in the differences where it is concerned should be kept in mind in interpreting results.

³ All “t’s” are significant at the 5% level. The first (5.0) and the last (3.0) are significant at the 1% level. Median correlation between scores given the same speakers by two Commands, r = .71.

Table II presents the mean intelligibil- 
ity scores given to speakers from each 
Command by listeners from each Com- 
mand. Although any single comparison 
of means in this table might be affected 
by a freakish combination of good 
speakers from one Command paired 
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with poor listeners from another, and 
vice versa, the contribution of each abil- 
ity to total variation among Commands 
can be surveyed by analysis of variance. 
The table also yields some comparisons 
of interest from inspection. Table III 
presents the results of an analysis of 
variance performed on the means of 
Table II. 


TABLE II.—LISTENING-SPEAKING ABILITIES OF SERVICE COMMANDS AS SHOWN BY AVERAGE SPEAKING
SCORE ASSIGNED SPEAKERS FROM EACH COMMAND BY LISTENERS FROM EACH COMMAND.


Service            Average Score Assigned by Service Command Listeners              Mean
Command                                                                             Speaking
of Speaker      1      2      3      4      5      6      7      8      9          Performance

1             49.7   47.2   50.0   53.3   50.8   60.9   47.6   44.6   53.1          50.8
2             57.2   50.3   52.6   49.0   54.2   52.9   51.9   48.5   50.9          51.9
3             54.6   45.3   48.7   45.2   48.0   50.1   47.6   45.8   49.2          48.3
4             54.2   49.0   49.5   59.8   51.0   55.3   54.2   [illegible]  52.0    53.0
5             48.2   48.6   46.3   45.6   50.2   51.3   47.6   45.1   50.9          48.2
6             54.0   51.9   50.3   52.4   52.8   54.0   57.4   50.5   52.7          52.9
7             45.8   48.7   48.2   52.3   47.7   52.6   47.2   45.7   49.8          48.7
8             46.0   46.5   47.4   46.9   48.4   48.9   45.3   47.6   49.6          47.4
9             48.1   52.6   48.5   44.7   48.6   51.2   49.2   48.3   49.1          48.9
Mean Listening
Performance   50.9   48.9   49.1   49.9   50.2   53.1   49.8   47.5   50.8          50.0


Speaker N from a Service Command being heard by listeners from a given Command ranges
from 21 to 278.


TABLE III.—LISTENING-SPEAKING PERFORMANCE OF SERVICE COMMANDS WHEN ALL GROUPS
ARE CORRECTED TO THE SAME SIZE.


a. Summary of Analysis of Variance

Source                                                                   df     Variance
Means of Service Command Speaker Performance as heard by Listeners
   from each Command                                                     80     11.38
Listening Performance Means                                               8     21.37
Speaking Performance Means                                                8     42.80
Listening X Speaking                                                     64      6.20

F, Listening Performance/Listening X Speaking = 3.45
F, Speaking Performance/Listening X Speaking = 6.90
F (1%) = 2.79


b. Mean Listening and Speaking Achievements in Rank Order

Listening Performance          Speaking Performance
Service                        Service
Command        Mean            Command        Mean
6              53.1            4              53.0
1              50.9            6              52.9
9              50.8            2              51.9
5              50.2            1              50.8
4              49.9            9              48.9
7              49.8            7              48.7
3              49.1            3              48.3
2              48.9            5              48.2
8              47.5            8              47.4

General Mean = 50.01

Standard error of any difference in means is 1.18; differences of 3.23 or larger are significant
at the 1% level of confidence. Differences of 2.61 or larger are significant at the 5% level of
confidence.

Rho, Speaker vs. Listener Ranks of Service Commands 
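
The rank-difference correlation named in the line above can be recomputed from the two rank orders printed in part b of the table. The sketch below applies Spearman's formula, rho = 1 - 6(sum of d squared) / (n(n squared - 1)), to those rank orders. The variable and function names are illustrative only, and the result is a recomputation from the printed ranks, not necessarily the figure originally reported.

```python
# Rank order of Service Commands, best first, as printed in Table III-b.
listening_order = [6, 1, 9, 5, 4, 7, 3, 2, 8]
speaking_order = [4, 6, 2, 1, 9, 7, 3, 5, 8]


def spearman_rho(order_a, order_b):
    """Spearman rank-difference correlation for two untied rank orders."""
    rank_a = {cmd: i + 1 for i, cmd in enumerate(order_a)}
    rank_b = {cmd: i + 1 for i, cmd in enumerate(order_b)}
    n = len(order_a)
    sum_d2 = sum((rank_a[c] - rank_b[c]) ** 2 for c in rank_a)
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


rho = spearman_rho(listening_order, speaking_order)
print(round(rho, 2))
```

Because neither column of means contains ties, no tie correction is needed in this sketch.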
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The Sixth Service Command’s listen- 
ing performance is substantially better 
than that of its nearest competitor;
the performances of listeners in the Sixth,
First, Ninth and Fifth are all better than
that of the Eighth, at the five per cent
level of confidence.

Although listening performance of the 
Commands varies significantly, it is not 
so great a contributor to variability be- 
tween Commands as difference in speak- 
ing performance. This is shown by the 
greater F-ratio associated with speaker 
variance as compared with that assoc-
iated with listener variance (Table
III-a). In the right-hand section of
Table III-b, speaking performances are
shown in rank order. These ranks are
assigned on the basis of equal represen-
tation of each Command on all listening
panels. Another set of ranks of speaker 
performances of Commands is given in 
Table IV. This shows rank order in 
terms of scores actually achieved by 
speakers, without weighting to remove 
the effect of varying representation in 
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listening panels. Both methods of com- 
putation place the Fourth and Sixth 
Commands first, and the Eighth last in 
speaker intelligibility. It should be 
recognized that these comparisons could 
have been affected by shifts in trainee 
population in conjunction with changed 
test conditions. Due to lack of time for 
analysis, no check of this possibility was 
made. 
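
The F-ratios cited from Table III-a are simply each main-effect variance divided by the Listening X Speaking variance. The following sketch reproduces that arithmetic. The standard-error line uses an assumed formula, the square root of twice the interaction variance divided by the nine cell means behind each marginal mean; the source does not state how its figure of 1.18 was obtained, and this assumption gives about 1.17.

```python
import math

# Variances from the analysis-of-variance summary in Table III-a.
var_listening = 21.37    # listening performance means, 8 df
var_speaking = 42.80     # speaking performance means, 8 df
var_interaction = 6.20   # listening x speaking, 64 df (used as error term)
n_per_mean = 9           # each marginal mean averages 9 cell means

f_listening = var_listening / var_interaction
f_speaking = var_speaking / var_interaction

# Assumed formula: standard error of a difference between two marginal means.
se_difference = math.sqrt(2 * var_interaction / n_per_mean)

print(round(f_listening, 2), round(f_speaking, 2), round(se_difference, 2))
```

Both ratios exceed the 1% value of 2.79 given in the table, and the speaking ratio is about twice the listening ratio, which is the comparison the paragraph above relies on.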

Table II, which gives a matrix of 
average speaking and listening scores for 
the Commands, yields some information 
by inspection. The Sixth Service Com- 
mand’s speakers were heard better than 
speakers in general by listeners from 
every Command. In contrast, no Com-
mand’s listeners heard the Eighth’s
speakers as well as speakers in general. 
The Fourth Service Command, which 
has high overall intelligibility, is heard 
well by some Commands, but slightly 
below average by two Commands. Due 
to the large number of factors involved 
in these scores, and the consequent pos- 
sibility that spurious differences might 


TABLE IV.—MEAN INTELLIGIBILITY SCORES OF STUDENT PILOTS BY SERVICE COMMAND OF
SPEAKER (In Rank Order).


a. Intelligibility Scores

Service Command      Mean Intelligibility      N
6                    52.3                      317
4                    51.7                      110
1                    50.9                      56
9                    50.4                      81
2                    49.8                      99
3                    49.4                      196
5                    49.2                      243
7                    47.8                      382
8                    47.0                      329
Total                49.4                      1913

Weighted Average S.D. = 15.29


b. Statistical Analysis

Service Commands (Most          Mean
Intelligible listed first)      Difference
6 - 8                           5.27
6 - 7                           4.47
6 - 5                           3.10
6 - 3                           2.90
4 - 8                           4.64
4 - 7                           3.84
9 - 8                           3.36

“t” (1%) = 2.58
“t” (5%) = 1.96
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2.607 
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occur, no Statistical evaluation of the 
reliability of these differences was 
attempted. They may offer some cues for 
further checking, however. 


In most of the cases where significant 
differential understandability was es-
tablished, a Northern and a Southern
Command were involved. Real differ- 
ences in speaking ability for areas are 
likely, but not definitely proved. The 
Fourth and Sixth Service Commands 
averaged high in speaking ability, the 
Eighth, low. The Sixth Service Com- 
mand furnished outstandingly good 
listeners. Educational factors, perhaps 
accounting for some of these differences, 
should have been minimized since only 
familiar one- and two-syllable words
were used in tests, and misspelling was 
not considered an error, if the oral word 
pattern was established by the spelling. 


The findings of this study apply to 
men chosen as pilots and air crewmen by 
AAF. Even though educational and 
alertness factors were not ruled out as 
related to differences in speech and 
listening habits, men were a cross sec- 
tion of the able, alert youth assembled to 
work together on a critical war mission. 
Important to this mission was ability to 
understand one another by means of 
speech signals under difficult noise con- 
ditions. Whatever the conditions respon- 
sible, the existence of regional peculiar- 
ities of speech within the United States, 
detrimental to universal understand- 
ability in such a group, is believed to be 
established. If more precisely defined 
speech regions formed the basis for 
analysis instead of Service Commands, 
the presumption is that greater differen-
tial understandability would be dis-
covered.
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TRAINING PROCEDURES 


FOSTER C. SHOUP 
General Motors Institute 


This report explains the training
program in which the results of the 
researches were utilized. It presents a 
description of the training room, an ex- 
planation of the role of the instructor, 
and a general view of the instructional 
materials. Since the instruction was more 
effective when given in high-level room 
noise (100-110 db) a small room was 
desirable. It was desired that students 
spend a high proportion of class time 
in voice drill and that they hear record- 
ings of their transmissions. The airplane 
interphone amplifier lost efficiency when 
fed to more than 12 headsets. 


The voice-communication trainer that 
was designed to fit the circumstances 
consisted of 48 positions or stations for 
students. Each station contained the two 
most commonly used microphones 
(hand-held and throat), jackbox, and 
headset. Stations were connected to make 
four separate circuits or party lines. The 
12 stations on each circuit were spaced 
about a single table and separated by 
dividing boards that extended approx- 
imately 12 inches above table level. The 
four circuits were wired into a specially 
designed control panel, the central piece 
of equipment at the instructor’s table. 


The instructor occupied a fifth table 
in the room. His equipment included, 
in addition to the control unit and air- 
plane-type noise generator, a 50-watt 
amplifier connected between the gener- 
ator and four speaker-reproducers that 
were spaced about the room, two voice 
recorder-reproducers, and two db meters. 
This equipment operated flexibly 
through the control panel with patch 
cords and switches so that the instructor 
could listen and talk to any station or 
record and play back any message. He


could also read from the meters the 
effective loudness level of any student's 
transmission. 

Figure I shows a section of the class 
room and the equipment at the instruc- 
tor’s table. 


The role of the instructor is inferred 
from the preceding description of the 
trainer. His job was largely to keep stu- 
dents talking and listening over the 
interphone equipment, that is, to keep 
voice drills going. Instead of lecturing 
about how to use the microphone or 
how loudly to talk, or explaining cor- 
rect message phraseology, he listened to 
the drills, interrupting the speaker with 
a request to talk louder or slower, to 
hold the microphone differently, or to 
change the message phraseology to con- 
form to standard forms. 


The courses that were used in various 
training environments were similar, but 
not identical. Approximately 20 cur- 
ricula were drawn up to provide each 
air-specialist with materials that as 
closely as possible represented the voice 
messages and communication skills that 
he would use in his job. These courses 
were usually four hours long. In this 
respect there were some compromises, 
more often than not resulting in longer 
courses. In one extreme a 24-hour 
course was requested and provided. In 
others, the voice training was combined 
naturally with existing courses, for ex- 
ample radio operation. Such combina- 
tions were especially fortunate for they 
placed voice training in a related con- 
text. Four hours, however, was the rec- 
ommended time for teaching intelligi- 
bility. 

FIGURE 1.—VIEW OF TRAINING ROOM AND OF INSTRUCTOR'S STATION (inset)

The first 30-40 minutes of the training
was devoted to a 12-item write-down or a
24-item multiple-choice word intelligi-
bility test. The tests were graded immed-
iately, with the class participating,
and each member became aware of his 
score and his standing in his class. He 
thus experienced the problem of com- 
municating in noise, received a measure 
of his proficiency, and was usually moti- 
vated for the remainder of the short 
course. 

Following the test in the first period, 


and at the outset of each remaining 
class period, the students heard a five- 
minute demonstration recording played 
to them through their headsets. These 
recordings explained — largely through 
examples of good and poor usages—the 
principles of speaking for intelligibility. 
They were partially dramatized with 
dialogue and sound effects. The follow- 
ing excerpt from one of the training 
records is characteristic: 


DEMONSTRATION RECORDING: RATE
Speaker 1: You have learned that loudness of speech is important for intelligible
           aircraft communication, but loud voice alone will not guarantee reception
           of your message. Listen to this speaker:
Speaker 2 (Loud, but very fast): Army 2784, position over Lebanon, time 1247 at 4000,
           cleared to enter traffic pattern at 1500, traffic west, runway 27, over.
Speaker 1: Even if you expected that message, you would not understand it. Such a
           fast rate of speaking, especially in airplane noise, produces unintelligible
           messages.


A primary consideration in using the recordings was to curtail instructors’
lectures, which could occur only at the expense of voice drills. Further, in
early training experiments in the Laboratory there was always an instructor
variable, one being more effective than another. This was accompanied by
different amounts of lecture-explanations and different points of emphasis.
The probability of greater variability among ground school instructors was
assumed. The demonstration recordings both standardized content and saved
time while providing exemplary messages pertinent to the voice drills for
the hour.


Third Hour (Bombardiers)

I. Objectives. At the conclusion of this lesson each student should:
   a. know the importance of slow, clear speech; b. have practiced slow,
   clear speech; c. have practiced using interphone messages.

II. Instructional Aids
   A. Demonstration recordings. “Rate” and “Pronunciation.” (Non-Pilot)
   B. Prepared sheets for drill 2, items 1-12 inclusive.
   C. Prepared sheets for drill 5.

III. Lesson Procedure            Time
   A. Demonstration              10     Play demonstration recordings,         Vol. I, Sec. V
                                        “Rate” and “Articulation”              for scripts.
                                        (Non-Pilot).
   B. Drill                      40
      1. Voice, rate.                   Use T-30 microphones.                  Drill 2, items 1-13
                                        Student monitor on each                inclusive.
                                        circuit to time messages.
      2. Interphone messages,           Record and playback portions.          Drill 5.
         aircrew members.


A typical class hour is exemplified in
Hour 3 of a course for bombardiers. In
Part III of the outline the left-hand col-
umn shows the lesson procedure in a
time sequence; the second column, the
time allotted to each procedure; next,
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directions for the instructor; and the 
right-hand column, references to the 
sources of the materials in manuals that 
he has. 


In the speaking exercises students on 
each party line worked as a unit. In most 
cases the sequence of messages in an ex- 
ercise made an entity, for example the 
radio and/or interphone messages of a 
simulated flight. As one person spoke, 
the others on the line listened. Fre- 
quently the message designated who of 
the listeners was to respond. Each exer- 
cise had a dual purpose, to establish 
habits for intelligibility and to estab- 
lish habitual use of routinized message 
forms. 


The following excerpt from a drill shows how some of the materials were
given to the instructor. He, in turn, reproduced them in usable form for
the individual stations.


DRILL 25: R/T POSITION REPORTS
This drill is a simulated airway flight. Plane number 3459 is cruising at 200 miles per hour.
Each position report should include in order, call, position relative to check point, time, altitude,
flight conditions, ETA next check point or destination.

Example
Station 1: Memphis Radio, this is Army 3459, over.
Station 7: Army 3459, this is Memphis Radio, over.
Station 1: Memphis Radio, this is Army 3459, 15 miles northeast Memphis,
           time 0935 at 4000 on instruments, estimate Texarkana time 1055,
           over.
Station 7: Army 3459, Roger, out.

Station      Station
Acting       Acting As          Range
As Plane     Range Station      Station       Enroute                 Additional Information
1            7                  Memphis       Nashville to Dallas     260 miles to Texarkana
2            8                  Texarkana                             100 miles to Dallas
3            9                  Dallas                                Request permission to
                                                                      contact Love Tower


The last half-hour of the course, like the first, was given over to
intelligibility testing, this test being similar to the earlier one in
type and difficulty.

A generalized summary of the four training periods follows:

Hour I:   Intelligibility test; demonstration recording, loudness;
          loudness drill.

Hour II:  Continued loudness drill; demonstration recording, rate;
          rate drill.

Hour III: Demonstration recording, articulation; articulation drill.

Hour IV:  Demonstration recording, accustomed patterns of speaking;
          procedure review drill; intelligibility test.

This résumé has pertained to training procedures that were worked out for
air-specialist training, that is, bombardiers, navigators, pilots, etc. in
their own schools. Another and equally extensive program was developed for
air-crew training, the bomber team. The training equipment, particularly
the control units, was more compact than the one described above. The
number of student stations at a training table varied according to the
size of the bomber crew. The training procedures differed from those
outlined above chiefly in that the training was carried on with the pilot
as the instructor. Otherwise, and except that the contents in this program
were entirely related to crew interphone messages, the two were similar.
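
The fixed field order that Drill 25 prescribes for a position report (call, position relative to check point, time, altitude, flight conditions, ETA at the next check point) lends itself to a small illustration. The sketch below is a hypothetical helper, not part of the original training materials; the class and field names are assumptions chosen for readability, and the sample values are those of the drill's own example.

```python
from dataclasses import dataclass


@dataclass
class PositionReport:
    """Fields of a position report, in the order Drill 25 prescribes."""
    station: str      # range station being called
    plane: str        # plane call sign
    position: str     # position relative to check point
    time: str         # clock time of the report
    altitude: str
    conditions: str   # flight conditions
    next_point: str   # next check point or destination
    eta: str          # estimated time of arrival at next_point

    def message(self) -> str:
        # Assemble the fields in the prescribed order, ending with "over".
        return (
            f"{self.station}, this is {self.plane}, {self.position}, "
            f"time {self.time} at {self.altitude} {self.conditions}, "
            f"estimate {self.next_point} time {self.eta}, over."
        )


report = PositionReport(
    station="Memphis Radio", plane="Army 3459",
    position="15 miles northeast Memphis", time="0935", altitude="4000",
    conditions="on instruments", next_point="Texarkana", eta="1055",
)
print(report.message())
```

Keeping the order in one place mirrors the drill's second purpose, the habitual use of routinized message forms.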

In summary, the training procedures 
in voice communication were charac- 
terized by directed drill over service 
equipment in the presence of high-level 
noise. Realism in course content and 
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training environment was approached. 
The instructor was guided largely by 
objective criteria in his role of critic. 
For example, if the meter showed low 
voice level the cause was insufficient
loudness, poor microphone position, or 
more than one open microphone on the 
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circuit. The cue for “speak more dis- 
tinctly” was the inability of a fellow lis- 
tener to repeat a message that had just 
been given. The training called for max- 
imum student performance and required, 
as a minimum on the part of the in- 
structor, interest and alertness. 


This paper presents measurements
relating to the effectiveness of the 
training program in voice communica- 
tion. The scope is limited to the effects 
of voice training as measured by intelli- 
gibility tests. It does not take into ac- 
count improvement resulting from bet- 
ter handling of the equipment or from 
use of regularized message forms. By 
assumption, increased intelligibility
might be expected to follow (1) experi- 
ence in communicating in flight, or (2) 
classes in which students spoke and lis- 
tened at will over electrical communi- 
cation equipment in noise, or (3) prac- 
tice with voice recording-reproducing 
equipment. Motivating elements char- 
acterize each instance sufficiently to keep 
students talking “just for fun.” In turn, 
all were relatively ineffective in produc- 
ing improvements in intelligibility. 
The effects of experience alone in changing speaking ability are shown in
Table I. Groups of instructors whose jobs necessitated teaching in the air,
combat returnees, students in training, and men awaiting training (all
stationed at one center) were tested for intelligibility under the same
circumstances. Their experience in flight and, consequently, in using
communication equipment ranged from none to the large amounts represented
by long periods instructing in an airplane or by completion of a tour of
combat duty followed by re-assignment to this country. In spite of these
differences in experience, there were no significant differences in
intelligibility among the groups, the cadets in training being on the
average as intelligible as the men who had returned from combat.1 The
conjecture is tempting that experience only reinforced original practices.
A survey of the frequency with which interphone messages had to be repeated
in order to be understood further supports the idea that experience alone
does not provide for adequate voice communication skill. Among aircrews
fully trained and ready to be sent to combat theatres, it was found that
40% of interphone messages were repeated one or more times. Many more were
never acknowledged as received.


TABLE I.—SCORES FOR INSTRUCTORS, COMBAT RETURNEES, CADETS, AND PRE-FLIGHT
STUDENT NAVIGATORS.

Group                    N       Mean Intelligibility Score      σM
Combat Returnees         20      60.2                            2.6
Cadets                   78      58.7                            1.8
Pre-Flight Students      38      60.3                            1.9
Total                    163     60.7


1 The scores of Table I are consistently higher than comparable ones from
the Laboratory. They were obtained in a training center with a standard
voice communication trainer. In this instance the average noise level was
7 db lower than the one in the Laboratory. Similarly, other mean
intelligibility scores varied from one installation to another depending
principally upon the noise level in the testing room, largely a function
of room size.


In an experimental voice training situation, focusing the attention of a
class upon voice factors was relatively ineffective. This is not to say
that there was no significant improvement; but the average increment was
considerably less
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than that resulting from fully-developed 
training procedures. Table II shows the gains in intelligibility of four
groups of student pilots, three of which were trained for 7-10 hours with
different emphases, and the remaining one tested and, after a lapse of time
comparable to the trained groups, re-tested. This section was a control
except that after the first test the members were permitted to talk at
random over an interphone trainer for a few minutes. All sections gained
significantly, at least at the 5 percent level of confidence. However,
there were no significant differences among the mean gains. The enigma
posed by this situation, in which the no-training students improved for all
practical purposes as much as the others, could arise from either of two
causes: improvement in intelligibility was limited to an amount resulting
from an awareness of a communication problem; or inadequate training
procedures, including time, were used. The training represented the best
guesses of professional speech teachers and took into account the
recommendations of flying instructors, experience with communication in
flight, researches of other laboratories, and earlier experience with
communication problems over service equipment.


TABLE II.—EFFECTS OF “BEST-GUESS” TRAINING ON INTELLIGIBILITY SCORES.

                                                        Initial
Group                                           N       Mean     S.D.     Gain     S.D.
Control (with slight practice)                                            9.3      10.2
7-hour training                                 68      46.0     11.3     6.2      12.9
10-hour training, last 3 hours devoted to
   pronunciation                                58      50.7     12.2     12.0     13.2
10-hour training, last 3 hours devoted to
   rate and loudness                            62      54.1     11.4     13.9     9.2
Total                                           243     49.5              10.2

F (differences between gain scores, 5% = 2.65) = 1.12


In the light of later developments these best guesses regarding the
contents and methods of teaching communication were not highly suitable.
First, undue attention was given to voice refinements, for example
articulation. Second, the recommended loudness level was too low and was
treated unreliably in criticisms by instructors. Third, there was too much
lecture and not enough drill in use of voice communication skills. Fourth,
rate of speaking and pitch of voice were over-emphasized. And fifth, too
much latitude was permitted in positioning microphones during the training
sessions.

At the expense of belaboring the 
point, it is noted that the instructors, 
fresh from college teaching, viewed the 
prospect of improving intelligibility in 
aircraft through training as a challenge, 
and worked hard to achieve favorable 
results in the experimental training. The 
prospect for success was viewed skepti- 
cally by many persons both in the army 
and NDRC. It was a discouraged group 
of teachers who viewed the results of the 
first efforts to improve intelligibility by 
training. Service communication equip- 
ment, testing techniques, lack of time 
for instruction, and poor teaching on the 
part of “the other fellow” were variously 
blamed. Nothing had been more effec- 
tive than giving a word intelligibility 
test to students, thus giving them direct 
evidence that they were unable to make 
themselves understood effectively in the 
presence of high level noise, and inclin-
ing them to take communication prob-
lems seriously.

The “awareness-of-a-problem” argu-
ment is especially used in advocating ex-
tensive reliance upon recorder-reprodu-
cers with training in speaking. The ar-
gument seemed cogent in view of the
possible failure of formal training and 
led to experiments to determine the 
effectiveness of students’ practicing with 
recording-reproducing devices without 
supervision. The results of two investi- 
gations were inconclusive. In one in- 
stance groups of 7-8 students recorded 
standard voice messages and listened to 
their records throughout a six-hour
training period without significant re- 
sults. In the other, 2-3 students worked 
similarly and showed significant im- 
provement on their second, that is, final 
intelligibility test. It was not possible 
to follow up this work. From an ad- 
ministrative standpoint the technique 
was limited to an auxiliary training 
method. 

Other papers in this series explain how course contents were found to be
consequential or irrelevant to intelligibility training. Similarly, methods
of presentation were evaluated, some more directly than others. For
example, a significant instructor variable in terms of mean class increment
led to presentations of minimum explanations through a set of recordings as
explained in a previous paper. These recordings seemed to be effective and
to improve instruction. Successive revisions were made on a basis of common
sense and observation, not experimentation.


TABLE III.—IMPROVEMENT IN INTELLIGIBILITY AFTER THREE HOURS TRAINING UNDER
NOISE CONDITIONS.

a. Initial, Final and Gain Scores.

N = maximum noise 108-110 db during all drill.
Q = no noise.
½ = full noise during ½ of drill period.

                            Initial              Final
Noise Condition     N       Mean     σM          Mean     σM       Gain
½NQN                27      53.9     [illegible] 70.2     1.4      16.3
NNN                 30      45.0     2.3         60.3     1.6      15.1
QQQ                 31      52.4     2.1         63.2     1.9      10.8
½½½                 27      46.0     2.4         56.5     1.9      10.5
QQN                 26      53.3     2.1         63.8     1.9      10.4
NQQ                 28      50.1     2.0         59.0     1.8      8.9
Total               169     50.1                 62.1              12.0

F methods/error = 3.00 (5% = 2.27; 1% = 3.14)

b. Significance of Differences Between Gain Scores.

                    Diff.    “t”                          Diff.    “t”
½NQN - NNN          1.2      0.49      NNN - QQQ          4.3      1.83
½NQN - QQQ          3.5      2.28      NNN - ½½½          4.6      1.89
½NQN - ½½½          5.8      2.32      NNN - QQN          4.6      1.87
½NQN - QQN          5.8      2.31      NNN - NQQ          5.1      2.53
½NQN - NQQ          7.3      2.95

“t” (5% = 2.05)


Simulated airplane noise, however, was called for in the training primarily
because its value was experimentally determined. Six groups of students
were tested for intelligibility before and after three hours of practicing
routine voice messages under supervision. Each group worked in a special
noise condition, the extremes being all drill in normal-quiet, and all
drill in high-level noise. As in other studies, all groups made significant
improvement. However, as shown in Table III, the students who worked in
noise for one-half of the first hour and
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all of the final, or third, hour made the 
greatest gain, an improvement that was 
significantly greater than all others with 
the exception of that made under the 
all-noise condition. 
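
The gain column of Table III-a can be checked against the initial and final means, and the conditions ranked by gain. The sketch below does both. The condition names spell out one reading of the table's labels hour by hour, which is an assumption because several labels are damaged in the source; the numbers are the means and gains as they appear in the table.

```python
# (initial mean, final mean, printed gain) for each noise condition in Table III-a.
# Keys give an assumed hour-by-hour reading of the condition labels.
table_iii = {
    "half-noise, quiet, noise": (53.9, 70.2, 16.3),
    "noise, noise, noise": (45.0, 60.3, 15.1),
    "quiet, quiet, quiet": (52.4, 63.2, 10.8),
    "half-noise all three hours": (46.0, 56.5, 10.5),
    "quiet, quiet, noise": (53.3, 63.8, 10.4),
    "noise, quiet, quiet": (50.1, 59.0, 8.9),
}

# Gain recomputed as final minus initial, for comparison with the printed gain.
recomputed = {name: round(final - initial, 1)
              for name, (initial, final, _) in table_iii.items()}
largest_gap = max(abs(recomputed[name] - printed)
                  for name, (_, _, printed) in table_iii.items())
best = max(table_iii, key=lambda name: table_iii[name][2])

print(best, recomputed[best], round(largest_gap, 1))
```

Recomputed and printed gains agree to within a few tenths of a point, a gap small enough to be consistent with rounding of the means, and the half-noise, quiet, noise schedule has the largest gain.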

In the early stages of developing a 
voice-communication training program, 
the instructors thought that more time 
was needed for instruction than the 10
hours that were available for trial pur- 
poses. This feeling increased with the 
futility of the first training efforts. On 
the other hand it was apparent that not 
much time was available in the crowded 
war-time training schedules. Therefore, 
after an effective course had been built 
an attempt was made to determine with- 
in practical limits the relation between 
the measured effect of training and the 
length of the course. Eight groups of 
students, each with approximately 40 


members, were measured for intelligibil- 
ity after being trained for different 
amounts of time up to eight hours. Three 
of the groups were given no training, 
only being tested at intervals during the 
experimental training period. Table IV 
shows that all trained classes were supe- 
rior to the others by highly significant 
differences. However, there is no incre- 
ment in average score corresponding to 
training beyond three hours. Caution is 
necessary in interpreting this set of re- 
sults because of the necessary lack of 
pre-training scores. There could have 
been significant differences among the 
groups at the outset, but the trend of the 
data argues strongly that training beyond 
three hours was not effective. These re- 
sults were interpreted to justify no more 
than four hours of training, this to in- 
clude intelligibility testing. 


TABLE IV.—INTELLIGIBILITY SCORES RELATED TO AMOUNT OF TRAINING.

                                   Hours of                  Mean Intelligibility
Hour    Subject Group              Instruction       N       Score                    S.D.
        All Groups                 Orientation
        Control Group A                                      50.8
4       Training Group A           2                 37      60.8                     11.5
5       Training Group B           3                 36      79.7                     7.0
6       Training Group C           4                 40      68.6                     7.9
        Control Group B                              40      54.2                     10.0
7       Training Group D           5                 35      67.7                     6.6
8       Control Group C                              40      50.1                     12.4
        Training Group E           [illegible]
        Trained Subjects                             186     66.0                     9.0
        Untrained Subjects                           119     51.7                     10.8
        Difference                                           15.1
        Significance of Difference “t” = 13.9 (1% = 2.56)


TABLE V.—EFFECTS OF VOICE COMMUNICATION TRAINING UPON SPEAKERS OF DIFFERENT
PRE-TRAINING ABILITY.

Decile      N              Initial Score      Final Score      Gain
1           [illegible]    70.21              74.93            4.72
2           [illegible]    63.64              72.71            9.07
3           [illegible]    60.57              [illegible]      10.43
4           [illegible]    57.21              72.14            14.93
5           [illegible]    54.36              68.57            13.71
6           [illegible]    51.71              73.00            21.29
7           [illegible]    48.50              69.57            21.07
8           [illegible]    43.64              69.86            26.22
9           [illegible]    39.93              64.71            24.78
10          [illegible]    30.07              65.86            35.79
Total                      52.6                                17.4
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The effectiveness of the training pro- 
gram with regard to intelligibility is 
illustrated in an analysis of a representa- 
tive class of 141 student pilots. Before 
training, the class had a mean intelli- 
gibility score of 52.6 (S. D., 11.4). After 
four hours of training the mean was 70.0 (S. D.,
8.5). Table V shows the gain of the 
class by deciles, divided according to 
pre-training intelligibility scores, and 
serves to explain the decrease in vari- 
ability that accompanied training. 
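
The narrowing of the class that Table V describes can be made concrete by comparing the spread of decile means before and after training. The sketch below adds each decile's gain to its initial score and compares the two ranges. The sixth initial score is partly illegible in the source and is taken here as 51.71, an assumption chosen because it agrees with the printed final score and gain for that decile; all names are illustrative.

```python
# Initial scores and gains by decile, as read from Table V (decile 1 = best speakers).
# initial[5] = 51.71 is an assumed reading of a partly illegible entry.
initial = [70.21, 63.64, 60.57, 57.21, 54.36, 51.71, 48.50, 43.64, 39.93, 30.07]
gain = [4.72, 9.07, 10.43, 14.93, 13.71, 21.29, 21.07, 26.22, 24.78, 35.79]

# Final score implied for each decile: initial score plus gain.
final = [round(i + g, 2) for i, g in zip(initial, gain)]

# Spread between the highest and lowest decile means, before and after.
initial_spread = round(max(initial) - min(initial), 2)
final_spread = round(max(final) - min(final), 2)

print(initial_spread, final_spread)
```

On these figures the gap between the best and poorest deciles shrinks from about 40 points to about 10, in line with the drop in standard deviation from 11.4 to 8.5 reported for the class.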

This example typifies the results of 
laboratory training conducted by speech 
teachers. If the program was to be of 
value it had to operate under the direc- 
tion of army instructors. Table VI shows 
that the results cited above were not 
unlike those obtained in field use where 
the instruction was given by teachers 
who had little speech training other than 
the one week’s indoctrination given at 
the Laboratory or at the station. Often 
these instructors were experts in radio 
code. Although only five centers were 


surveyed for more than indicative re- 
sults, they appeared to represent what 
was happening generally in basic train- 
ing installations. 
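
The comparison with the laboratory class can be made by subtracting each center's untrained mean from its trained mean in Table VI. The sketch below performs that subtraction. The pairing of means with center types follows the order in which the values appear in the table, which is an assumption about a damaged layout; the laboratory figure of 17.4 is the difference between the class means of 70.0 and 52.6 reported above.

```python
# (center type, untrained mean, trained mean) as read from Table VI.
centers = [
    ("Pilot", 49.3, 67.1),
    ("Pilot", 59.6, 76.7),
    ("Gunner", 67.7, 81.7),
    ("Navigator", 65.9, 85.4),
    ("Bombardier", 68.6, 84.7),
]

# Gain at each center: trained mean minus untrained mean.
gains = [round(trained - untrained, 1) for _, untrained, trained in centers]

laboratory_gain = round(70.0 - 52.6, 1)  # representative laboratory class

print(gains, laboratory_gain)
```

The field differences recomputed this way fall on either side of the laboratory figure, which is the sense in which the field results were "not unlike" those of the Laboratory.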

Another aspect of the effectiveness of 
training is the retention of an acquired 
skill. Follow-up tests were given to two 
classes 30 days after they were trained 
in voice communication. The continued 
improvement in mean intelligibility


scores was significant in both instances, 


but could be explained on a basis of 
experience with the test. It indicated, at 
least, that the improvement that result- 
ed from training was retained. 

Thus, this training program brought 
about general improvement in voice in- 
telligibility through short periods of 
training. The less effective students made 
considerably greater gains than the ones 
who were initially proficient. Over-all, 
the results clearly show the advantages 
of using experimental methods in arriv-
ing at course contents and teaching
methods. 


TABLE VI.—INTELLIGIBILITY SCORES OF SPEAKERS TRAINED IN COMMUNICATION AT
FIVE AAF CENTERS.

Type of
Training
Center            N        Untrained Mean      Trained Mean      Gain
Pilot             329      49.3                67.1              17.4
Pilot             128      59.6                76.7              17.1
Gunner            543      67.7                81.7              [illegible]
Navigator         229      65.9                85.4              19.5
Bombardier        78       68.6                84.7              16.1


